1
Selvakumar R, Jat GS, Manjunathagowda DC. Allele mining through TILLING and EcoTILLING approaches in vegetable crops. PLANTA 2023; 258:15. [PMID: 37311932 DOI: 10.1007/s00425-023-04176-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Accepted: 06/01/2023] [Indexed: 06/15/2023]
Abstract
MAIN CONCLUSION This review gives a comprehensive overview of allele mining for the genetic improvement of vegetable crops, covering allele exploration methods and their use in pre-breeding for economically important traits. Vegetable crops have numerous wild relatives, ancestors and landraces that could be exploited to develop high-yielding, climate-resilient varieties that are resistant or tolerant to biotic and abiotic stresses. To further boost the genetic potential of economic traits, the available genomic tools must be directed at discovering beneficial alleles in genetic stocks and wild relatives and introgressing them into cultivated types. This capability would give plant breeders direct access to critical alleles that confer higher production, improve bioactive compounds, and increase water and nutrient productivity as well as biotic and abiotic stress resilience. Allele mining is a technique for dissecting naturally occurring allelic variants in candidate genes that influence important traits, and these variants could be used for the genetic improvement of vegetable crops. Targeting Induced Local Lesions IN Genomes (TILLING) is a sensitive mutation detection approach in functional genomics, particularly where genome sequence information is limited or unavailable. TILLING screens populations exposed to chemical mutagens for induced mutations, whereas EcoTILLING detects naturally occurring SNPs and InDels. It is anticipated that as TILLING is used for vegetable crop improvement in the near future, indirect benefits will become apparent. Therefore, in this review we highlight up-to-date information on allele mining for genetic enhancement in vegetable crops, on methods of allele exploration, and on their use in pre-breeding for the improvement of economic traits.
Affiliation(s)
- Raman Selvakumar
- ICAR-Indian Agricultural Research Institute, Pusa Campus, New Delhi, 110 012, India
- Gograj Singh Jat
- ICAR-Indian Agricultural Research Institute, Pusa Campus, New Delhi, 110 012, India
2
Shah WA, Jan A, Khan MA, Saeed M, Rahman N, Afridi MS, Khuda F, Akbar R. Association between Aldosterone Synthase (CYP11B2) Gene Polymorphism and Hypertension in Pashtun Ethnic Population of Khyber Pakhtunkwha, Pakistan. Genes (Basel) 2023; 14:1184. [PMID: 37372364 DOI: 10.3390/genes14061184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2023] [Revised: 05/20/2023] [Accepted: 05/23/2023] [Indexed: 06/29/2023] Open
Abstract
Genome-wide association studies have significantly increased the number of known hypertension risk variants; however, most of them focused on European populations. Such studies are lacking in developing countries, including Pakistan. This gap, together with the high prevalence of hypertension in the Pakistani community, prompted us to design this study. Aldosterone synthase (CYP11B2) has been thoroughly studied in different ethnic groups; however, no such study has been conducted in the Pashtun population of Khyber Pakhtunkhwa (KP), Pakistan. The aldosterone synthase gene (CYP11B2) plays a significant role in essential hypertension. Aldosterone synthesis is affected by both hereditary and environmental factors. Aldosterone synthase (encoded by the CYP11B2 gene) controls the conversion of deoxycorticosterone to aldosterone and, thus, has genetic influences. Polymorphisms in the CYP11B2 gene are linked to an increased risk of hypertension. Previous research on the polymorphism of the aldosterone synthase (CYP11B2) gene and its relationship to hypertension produced inconclusive results. The present study investigates the relationship between CYP11B2 gene polymorphism and hypertension in Pakistan's Pashtun population. We used the emerging whole-exome sequencing (WES) method to identify variants associated with hypertension. The research was divided into two phases. In phase one, DNA samples from 200 adult hypertension patients (aged ≥ 30 years) and 200 controls were pooled (n = 200/pool) and subjected to WES. In the second phase, the SNPs reported by WES were genotyped using the MassARRAY technique to confirm their association with hypertension. WES identified a total of eight genetic variants in the CYP11B2 gene. The chi-square test and logistic regression analysis were used to compare minor allele frequencies (MAFs) and to assess the relationship of the chosen SNPs with hypertension. 
The frequency of the minor allele T of rs1799998 in the CYP11B2 gene was higher in cases than in controls (42% vs. 30%; p = 0.001), while the remaining SNPs (rs4536, rs4537, rs4545, rs4543, rs4539, rs4546 and rs6418) showed no significant association with hypertension (HTN) in the studied population (all p > 0.05). Our findings suggest that rs1799998 increases susceptibility to HTN in the Pashtun population of KP, Pakistan.
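As a rough illustration of the allele-frequency comparison reported above, the sketch below runs a 2×2 chi-square test on allele counts for rs1799998. The counts are an assumption: they are reconstructed from the rounded percentages (42% vs. 30%) by supposing that all 200 cases and 200 controls were genotyped, which gives 400 alleles per group. The helper name `allele_chi_square` is hypothetical, and the result need not reproduce the published p-value.

```python
import math


def allele_chi_square(case_minor, case_total, ctrl_minor, ctrl_total):
    """Pearson chi-square (1 df, no continuity correction) on a 2x2 allele table.

    Returns (chi2, p_value, odds_ratio).
    """
    a, b = case_minor, case_total - case_minor   # cases: minor, major alleles
    c, d = ctrl_minor, ctrl_total - ctrl_minor   # controls: minor, major alleles
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# Assumed counts: 200 individuals per group -> 400 alleles per group.
# 42% of 400 = 168 minor alleles in cases; 30% of 400 = 120 in controls.
chi2, p_value, odds_ratio = allele_chi_square(168, 400, 120, 400)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}, OR = {odds_ratio:.2f}")
```

With real genotype counts in place of the assumed ones, the same function could be applied to each of the eight SNPs in turn.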
Affiliation(s)
- Waheed Ali Shah
- Department of Pharmacy, University of Peshawar, Peshawar 25000, Pakistan
- Asif Jan
- Department of Pharmacy, University of Peshawar, Peshawar 25000, Pakistan
- District Headquarter Hospital (DHQH) Charsadda 24430, Pakistan
- Muhammad Saeed
- Department of Pharmacy, Qurtaba University of Science and Technology, Peshawar 25000, Pakistan
- Naveed Rahman
- Department of Pharmacy, University of Peshawar, Peshawar 25000, Pakistan
- Muhammad Sajjad Afridi
- Department of Pharmacy, Qurtaba University of Science and Technology, Peshawar 25000, Pakistan
- Fazli Khuda
- Department of Pharmacy, University of Peshawar, Peshawar 25000, Pakistan
- Rani Akbar
- Department of Pharmacy, Abdul Wali Khan University, Mardan 23200, Pakistan
3
Guirao‐Rico S, González J. Benchmarking the performance of Pool-seq SNP callers using simulated and real sequencing data. Mol Ecol Resour 2021; 21:1216-1229. [PMID: 33534960 PMCID: PMC8251607 DOI: 10.1111/1755-0998.13343] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/24/2020] [Revised: 12/21/2020] [Accepted: 01/27/2021] [Indexed: 12/13/2022]
Abstract
Population genomics is a fast-developing discipline with promising applications in a growing number of life sciences fields. Advances in sequencing technologies and bioinformatics tools allow population genomics to exploit genome-wide information to identify the molecular variants underlying traits of interest and the evolutionary forces that modulate these variants through space and time. However, the cost of genomic analyses of multiple populations is still too high to address them through individual genome sequencing. Pooling individuals for sequencing can be a more effective strategy for Single Nucleotide Polymorphism (SNP) detection and allele frequency estimation because of a higher total coverage. However, compared to individual sequencing, SNP calling from pools has the additional difficulty of distinguishing rare variants from sequencing errors, which is often avoided by establishing a minimum threshold allele frequency for the analysis. Finding an optimal balance between minimizing information loss and reducing sequencing costs is essential to ensure the success of population genomics studies. Here, we have benchmarked the performance of SNP callers for Pool-seq data, based on different approaches, under different conditions, and using computer simulations and real data. We found that SNP caller performance varied for allele frequencies up to 0.35. We also found that SNP callers based on Bayesian (SNAPE-pooled) or maximum likelihood (MAPGD) approaches outperform the two heuristic callers tested (VarScan and PoolSNP) in terms of the balance between sensitivity and FDR, in both simulated and real sequencing data. Our results will help inform the selection of the most appropriate SNP caller not only for large-scale population studies but also in cases where the Pool-seq strategy is the only option, such as in metagenomic or polyploid studies.
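The trade-off described above, between a minimum allele-frequency threshold and the balance of sensitivity and FDR, can be illustrated with a toy heuristic caller. This is a minimal sketch under stated assumptions, not the algorithm of VarScan, PoolSNP, SNAPE-pooled or MAPGD: the function names, the cutoffs and the read counts are all hypothetical.

```python
def call_snps(sites, min_freq=0.05, min_alt_reads=2):
    """Return positions whose alternate-allele frequency passes simple cutoffs.

    `sites` maps position -> (alt_reads, depth) for one pooled sample.
    """
    called = set()
    for pos, (alt_reads, depth) in sites.items():
        if depth == 0:
            continue  # no coverage, nothing to call
        freq = alt_reads / depth
        if alt_reads >= min_alt_reads and freq >= min_freq:
            called.add(pos)
    return called


def sensitivity_and_fdr(called, truth):
    """Sensitivity = TP / (TP + FN); FDR = FP / (TP + FP)."""
    tp = len(called & truth)
    fp = len(called - truth)
    fn = len(truth - called)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, fdr


# Hypothetical pooled read counts: position -> (alt_reads, depth).
sites = {
    101: (30, 100),  # common true SNP
    205: (3, 100),   # rare true SNP, below the 5% frequency cutoff
    310: (6, 100),   # sequencing error that happens to pass the cutoffs
    412: (1, 100),   # sequencing error, filtered out
}
truth = {101, 205}

called = call_snps(sites)
sensitivity, fdr = sensitivity_and_fdr(called, truth)
print(sorted(called), sensitivity, fdr)
```

In this made-up example the frequency cutoff discards the rare true variant while still admitting one error, which is the kind of loss that model-based callers aim to reduce.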
Affiliation(s)
- Sara Guirao‐Rico
- Institute of Evolutionary Biology, CSIC‐Universitat Pompeu Fabra, Barcelona, Spain
- Josefa González
- Institute of Evolutionary Biology, CSIC‐Universitat Pompeu Fabra, Barcelona, Spain
4
Jan A, Saeed M, Afridi MH, Khuda F, Shabbir M, Khan H, Ali S, Hassan M, Akbar R. Association of HLA-B Gene Polymorphisms with Type 2 Diabetes in Pashtun Ethnic Population of Khyber Pakhtunkhwa, Pakistan. J Diabetes Res 2021; 2021:6669731. [PMID: 34258292 PMCID: PMC8254654 DOI: 10.1155/2021/6669731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2021] [Revised: 04/20/2021] [Accepted: 06/07/2021] [Indexed: 11/26/2022] Open
Abstract
The human leukocyte antigen (HLA) system is the most polymorphic and gene-dense region of human DNA and has shown many disease associations. It is divided into HLA classes I, II, and III. Polymorphism in HLA class II genes has been reported to play an important role in the pathogenesis of type 1 diabetes (T1D). It has also shown association with type 2 diabetes (T2D) in different ethnic populations. However, little is known about the relationship between HLA class I gene polymorphism and T2D. This study evaluated the association of HLA-B (class I gene) variants with T2D in the Pashtun ethnic population of Khyber Pakhtunkhwa. In the first phase of the study, whole exome sequencing (WES) of 2 pooled DNA samples was carried out; the DNA pools were constructed from 100 diabetic cases and 100 control subjects. WES identified a total of 17 SNPs in the HLA-B gene. In the next phase, the first 5 of the 17 reported SNPs were genotyped using the MassARRAY® system in order to validate the WES results and to confirm the association of the selected SNPs with T2D. Minor allele frequencies (MAFs) and the association of the selected SNPs with T2D were determined using the chi-square test and logistic regression analysis. The frequency of the minor C allele of rs2308655 in the HLA-B gene was significantly higher in the T2D group than in the control group (45.0% vs. 13.0%; p = 0.006). No significant difference in MAF distribution between cases and controls was observed for rs1051488, rs1131500, rs1050341, and rs1131285 (p > 0.05). Binary logistic regression analyses showed significant results for SNP rs2308655 (OR = 2.233, CI (95%) = 1.223-4.077, and p = 0.009), while no considerable association was observed for the other 4 SNPs. However, when adjusted for these variants, the association of rs2308655 strengthened significantly (adjusted OR = 7.485, CI (95%) = 2.353-23.812, and p = 0.001), except for rs1131500, which had no additive effect. 
In conclusion, the findings of this study suggest that the rs2308655 variant in the HLA-B gene is a risk variant for T2D susceptibility in the Pashtun population.
Affiliation(s)
- Asif Jan
- Department of Pharmacy, University of Peshawar, Pakistan
- Muhammad Saeed
- Department of Pharmacy, University of Peshawar, Pakistan
- Fazli Khuda
- Department of Pharmacy, University of Peshawar, Pakistan
- Muhammad Shabbir
- Internal Medicine, College of Medicine, Shaqra University, Saudi Arabia
- Hamayun Khan
- Department of Pharmacy, University of Peshawar, Pakistan
- Sajid Ali
- Department of Biotechnology, Abdul Wali Khan University, Mardan, Pakistan
- Rani Akbar
- Department of Pharmacy, Abdul Wali Khan University, Mardan, Pakistan
5
Cohen JD, Diergaarde B, Papadopoulos N, Kinzler KW, Schoen RE. Tumor DNA as a Cancer Biomarker through the Lens of Colorectal Neoplasia. Cancer Epidemiol Biomarkers Prev 2020; 29:2441-2453. [PMID: 33033144 PMCID: PMC7710619 DOI: 10.1158/1055-9965.epi-20-0549] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/13/2020] [Revised: 07/06/2020] [Accepted: 09/30/2020] [Indexed: 12/24/2022] Open
Abstract
Biomarkers have a wide range of applications in the clinical management of cancer, including screening and therapeutic management. Tumor DNA released from neoplastic cells has become a particularly active area of cancer biomarker development due to the critical role somatic alterations play in the pathophysiology of cancer and the ability to assess released tumor DNA in accessible clinical samples, in particular blood (i.e., liquid biopsy). Many of the early applications of tumor DNA as a biomarker were pioneered in colorectal cancer due to its well-defined genetics and common occurrence, the effectiveness of early detection, and the availability of effective therapeutic options. Herein, in the context of colorectal cancer, we describe how the intended clinical application dictates desired biomarker test performance, how features of tumor DNA provide unique challenges and opportunities for biomarker development, and conclude with specific examples of clinical application of tumor DNA as a biomarker with particular emphasis on early detection. See all articles in this CEBP Focus section, "NCI Early Detection Research Network: Making Cancer Detection Possible."
Affiliation(s)
- Joshua D Cohen
- Ludwig Center for Cancer Genetics and Therapeutics, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Sidney Kimmel Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Sol Goldman Pancreatic Cancer Research Center, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Brenda Diergaarde
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania
- UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, Pennsylvania
- Nickolas Papadopoulos
- Ludwig Center for Cancer Genetics and Therapeutics, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Sidney Kimmel Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Sol Goldman Pancreatic Cancer Research Center, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Kenneth W Kinzler
- Ludwig Center for Cancer Genetics and Therapeutics, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Sidney Kimmel Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Sol Goldman Pancreatic Cancer Research Center, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Robert E Schoen
- Department of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania
- Department of Epidemiology, University of Pittsburgh, Pittsburgh, Pennsylvania
6
Zhang T, Pilko A, Wollman R. Loci specific epigenetic drug sensitivity. Nucleic Acids Res 2020; 48:4797-4810. [PMID: 32246716 PMCID: PMC7229858 DOI: 10.1093/nar/gkaa210] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/31/2019] [Revised: 02/10/2020] [Accepted: 03/27/2020] [Indexed: 12/14/2022] Open
Abstract
Therapeutic targeting of epigenetic modulators offers a novel approach to the treatment of multiple diseases. The cellular consequences of chemical compounds that target epigenetic regulators (epi-drugs) are complex. Epi-drugs affect global cellular phenotypes and cause local changes to gene expression by altering a gene's chromatin environment. Despite increasing use in the clinic, the mechanisms responsible for cellular changes are unclear. Specifically, it is unknown to what degree the effects result from cell-wide changes or from disease-related, locus-specific effects. Here we developed a platform to systematically and simultaneously investigate sensitivity to epi-drugs at hundreds of genomic locations by combining DNA barcoding, unique split-pool encoding, and single-cell expression measurements. Internal controls are used to isolate locus-specific effects from any global consequences these drugs have. Using this platform we discovered widespread locus-specific sensitivities to three distinct epi-drugs that target histone deacetylase, DNA methylation and bromodomain proteins. By leveraging ENCODE data on chromatin modification, we identified features of chromatin environments that are most likely to be affected by epi-drugs. The measurements of locus-specific epi-drug sensitivities will pave the way to the development of targeted therapy for personalized medicine.
Affiliation(s)
- Thanutra Zhang
- Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, CA, USA
- Anna Pilko
- Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, CA, USA
- Departments of Integrative Biology and Physiology and Chemistry and Biochemistry, University of California, Los Angeles (UCLA), CA, USA
- Roy Wollman
- Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, CA, USA
- Departments of Integrative Biology and Physiology and Chemistry and Biochemistry, University of California, Los Angeles (UCLA), CA, USA
7
Khan S, Zhao X, Hou Y, Yuan C, Li Y, Luo X, Liu J, Feng X. Analysis of genome-wide SNPs based on 2b-RAD sequencing of pooled samples reveals signature of selection in different populations of Haemonchus contortus. J Biosci 2019; 44:97. [PMID: 31502575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
The parasitic nematode Haemonchus contortus is one of the world's most important parasites of small ruminants and causes significant economic losses to the livestock sector. The population structure and selection in its various strains are poorly understood. No study so far has compared its different populations using genome-wide data. Here, we focused on different geographic populations of H. contortus from China (Tibet, TB; Hubei, HB; Inner Mongolia, IM; Sichuan, SC), the UK and Australia (AS), using genome-wide population-genomic approaches to explore genetic diversity, population structure and selection. We first performed next-generation high-throughput 2b-RAD pool sequencing using Illumina technology, and identified single-nucleotide polymorphisms (SNPs) in all the strains. We identified 75,187 SNPs for TB, 82,271 for HB, 82,420 for IM, 79,803 for SC, 83,504 for AS and 78,747 for the UK strain. The SNPs revealed low nucleotide diversity (pi = 0.0092-0.0133) within each strain, and a significant level of differentiation (average Fst = 0.34264) among them. The Chinese populations TB and SC, along with the UK strain, were the more divergent populations. The Chinese populations IM and HB showed affinities to the Australian strain. We then analysed signatures of selection and detected 44 (UK) and 3 (AS) private selective sweeps containing 49 and 5 genes, respectively. Finally, we performed functional annotation of the selective sweeps and proposed a biological significance for the signatures of selection. Our data suggest that 2b-RAD pool sequencing can be used to assess signatures of selection in H. contortus.
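For readers unfamiliar with the differentiation statistic quoted above, the sketch below computes a simple heterozygosity-based Fst for one biallelic SNP from population allele frequencies, using Fst = (H_T - H_S) / H_T. It is illustrative only: the abstract does not state which Fst estimator was used, the allele frequencies are made up, and the function name `fst_biallelic` is hypothetical. The sketch also assumes equal weighting of populations and ignores the sampling corrections that pooled data normally require.

```python
def fst_biallelic(freqs):
    """Fst = (H_T - H_S) / H_T for one biallelic SNP.

    `freqs` holds the frequency of one allele in each population;
    populations are weighted equally.
    """
    if not freqs:
        raise ValueError("need at least one population")
    # H_S: mean expected heterozygosity within populations.
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    # H_T: expected heterozygosity at the mean allele frequency.
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # site is monomorphic across all populations
    return (h_t - h_s) / h_t


# Made-up allele frequencies for two populations.
print(round(fst_biallelic([0.2, 0.8]), 3))
```

A genome-wide average such as the one reported would combine many SNPs, and estimators differ in how they do so; that step is left out here.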
Affiliation(s)
- Sawar Khan
- Shanghai Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Key Laboratory of Animal Parasitology, Ministry of Agriculture of China, Shanghai 200241, People's Republic of China
8
Huang JK, Fan L, Wang TY, Wu PS. A new primer construction technique that effectively increases amplification of rare mutant templates in samples. BMC Biotechnol 2019; 19:62. [PMID: 31443709 PMCID: PMC6708177 DOI: 10.1186/s12896-019-0555-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/03/2019] [Accepted: 08/15/2019] [Indexed: 02/07/2023] Open
Abstract
Background In personalized medicine, companion diagnostic tests provide additional information to help select a treatment option likely to be optimal for a patient. Although such tests include several techniques for detecting low levels of mutant genes in wild-type backgrounds with fairly high sensitivity, most tests are not specific, and may exhibit high false positive rates. In this study, we describe a new primer structure, named ‘stuntmer’, to selectively suppress amplification of wild-type templates, and promote amplification of mutant templates. Results A single stuntmer for a defined region of DNA can detect several kinds of mutations, including point mutations, deletions, and insertions. Stuntmer PCRs are also highly sensitive, being able to amplify mutant sequences that may make up as little as 0.1% of the DNA sample. Conclusion In conclusion, our technique, stuntmer PCR, can provide a simple, low-cost, highly sensitive, highly accurate, and highly specific platform for developing companion diagnostic tests.
Affiliation(s)
- Jr-Kai Huang
- Department of Pathology, Mackay Memorial Hospital, Taipei, Taiwan
- Ling Fan
- Department of Nuclear Medicine, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- Tao-Yeuan Wang
- Department of Pathology, Mackay Memorial Hospital, Taipei, Taiwan
- Pao-Shu Wu
- Department of Pathology, Mackay Memorial Hospital, Taipei, Taiwan; Mackay Junior College of Medicine, Nursing, and Management, Taipei, Taiwan
9
Khan S, Zhao X, Hou Y, Yuan C, Li Y, Luo X, Liu J, Feng X. Analysis of genome-wide SNPs based on 2b-RAD sequencing of pooled samples reveals signature of selection in different populations of Haemonchus contortus. J Biosci 2019. [DOI: 10.1007/s12038-019-9917-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
10
Zhang A, Li S, Apone L, Sun X, Chen L, Ettwiller LM, Langhorst BW, Noren CJ, Xu MQ. Solid-phase enzyme catalysis of DNA end repair and 3' A-tailing reduces GC-bias in next-generation sequencing of human genomic DNA. Sci Rep 2018; 8:15887. [PMID: 30367148 PMCID: PMC6203771 DOI: 10.1038/s41598-018-34079-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 03/16/2018] [Accepted: 10/06/2018] [Indexed: 01/28/2023] Open
Abstract
The use of next-generation sequencing (NGS) has been instrumental in advancing biological research and clinical diagnostics. To fully utilize the power of NGS, complete, uniform coverage of the entire genome is required. In this study, we identified the primary sources of bias observed in sequence coverage across AT-rich regions of the human genome with existing amplification-free DNA library preparation methods. We have found evidence that a major source of bias is the inefficient processing of AT-rich DNA in end repair and 3' A-tailing, causing under-representation of extremely AT-rich regions. We have employed immobilized DNA modifying enzymes to catalyze end repair and 3' A-tailing reactions, to notably reduce the GC bias observed with existing library construction methods.
Affiliation(s)
- Aihua Zhang
- New England Biolabs, Inc., 240 County Road, Ipswich, MA, 01938, USA
- Shaohua Li
- New England Biolabs, Inc., 240 County Road, Ipswich, MA, 01938, USA
- Lynne Apone
- New England Biolabs, Inc., 240 County Road, Ipswich, MA, 01938, USA
- Xiaoli Sun
- New England Biolabs, Inc., 240 County Road, Ipswich, MA, 01938, USA
- Lixin Chen
- New England Biolabs, Inc., 240 County Road, Ipswich, MA, 01938, USA
- Ming-Qun Xu
- New England Biolabs, Inc., 240 County Road, Ipswich, MA, 01938, USA
11
A new approach based on targeted pooled DNA sequencing identifies novel mutations in patients with Inherited Retinal Dystrophies. Sci Rep 2018; 8:15457. [PMID: 30337596 PMCID: PMC6194132 DOI: 10.1038/s41598-018-33810-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 02/19/2018] [Accepted: 10/04/2018] [Indexed: 01/28/2023] Open
Abstract
Inherited retinal diseases (IRDs) are a heterogeneous group of diseases that mainly affect the retina; more than 250 genes have been linked to the disease and more than 20 different clinical phenotypes have been described. This heterogeneity at both the clinical and genetic levels complicates the identification of causative mutations. Therefore, a detailed genetic characterization is important for genetic counselling and decisions regarding treatment. In this study, we developed a method consisting of pooled targeted next generation sequencing (NGS) that we applied to 316 eye disease related genes, followed by High Resolution Melting and copy number variation analysis. DNA from 115 unrelated test samples was pooled and samples with known mutations were used as positive controls to assess the sensitivity of our approach. Causal mutations for IRDs were found in 36 patients, achieving a detection rate of 31.3%. Overall, 49 likely causative mutations were identified in characterized patients, 14 of which were first described in this study (28.6%). Our study shows that this new approach is a cost-effective tool for detection of causative mutations in patients with inherited retinopathies.
12
Uli N, Michelen-Gomez E, Ramos EI, Druley TE. Age-specific changes in genome-wide methylation enrich for Foxa2 and estrogen receptor alpha binding sites. PLoS One 2018; 13:e0203147. [PMID: 30256791 PMCID: PMC6157835 DOI: 10.1371/journal.pone.0203147] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 01/24/2018] [Accepted: 08/15/2018] [Indexed: 12/26/2022] Open
Abstract
The role of DNA methylation patterns in complex phenotypes remains unclear. To explore this question, we adapted our methods for rare variant analysis to characterize a genome-wide murine DNA hybridization array for investigating methylation at CpG islands, shores, and regulatory elements. We have applied this platform to compare age- and tissue-specific methylation differences in the brain and spleen of young and aged mice. As expected from prior studies, there are clear global differences in organ-specific, but not age-specific, methylation due mostly to changes at repetitive elements. Surprisingly, out of 200,000 loci there were only 946 differentially methylated cytosines (DMCs) between young and old samples (529 hypermethylated, 417 hypomethylated in aged mice) compared to thousands of tissue-specific DMCs. Hypermethylated loci were clustered around the promoter region of Sfi1, exon 2 of Slc11a2, Drg1, Esr1 and Foxa2 transcription factor binding sites. In particular, there were 75 hypermethylated Foxa2 binding sites across a 2.7 Mb region of chromosome 11. Hypomethylated loci were clustered around Mid1, Isoc2b and genome-wide loci with binding sites for Foxa2 and Esr1, which are known to play important roles in development and aging. These data suggest discrete tissue-independent methylation changes associated with aging processes such as cell division (Sfi1, Mid1), energy production (Drg1, Isoc2b) and cell death (Foxa2, Esr1).
Affiliation(s)
- Nishanth Uli
- Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Eduardo Michelen-Gomez
- Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Enrique I. Ramos
- Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Todd E. Druley
- Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Department of Pediatrics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Department of Developmental Biology, Washington University School of Medicine, St. Louis, Missouri, United States of America
13
Ryu S, Han J, Norden-Krichmar TM, Schork NJ, Suh Y. Effective discovery of rare variants by pooled target capture sequencing: A comparative analysis with individually indexed target capture sequencing. Mutat Res 2018; 809:24-31. [PMID: 29677560 PMCID: PMC5962423 DOI: 10.1016/j.mrfmmm.2018.03.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 03/27/2018] [Accepted: 03/28/2018] [Indexed: 01/11/2023]
Abstract
Identification of all genetic variants associated with complex traits is one of the most important goals in modern human genetics. Genome-wide association studies (GWAS) have been successfully applied to identify common variants, which thus far explain only a small portion of heritability. Interest in rare variants as an answer to this missing heritability has been growing. While next-generation sequencing allows detection of rare variants, its cost is still prohibitively high for sequencing the large number of human DNA samples required for rare variant association studies. In this study, we evaluated the sensitivity and specificity of sequencing pooled DNA samples of multiple individuals (Pool-seq) as a cost-effective and robust approach for rare variant discovery. We comparatively analyzed Pool-seq vs. individual-seq of indexed target capture of up to 960 genes in ∼1000 individuals, followed by independent genotyping validation studies. We found that Pool-seq was as effective and accurate as individual-seq in detecting rare variants and accurately estimating their minor allele frequencies (MAFs). Our results suggest that Pool-seq can be used as an efficient and cost-effective method for discovery of rare variants for population-based sequencing studies in individual laboratories.
Affiliation(s)
- Seungjin Ryu
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
- Jeehae Han
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
- Nicholas J Schork
- The Scripps Research Institute, La Jolla, CA 92037, USA; J. Craig Venter Institute, La Jolla, CA, 92037, USA
- Yousin Suh
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, 10461, USA; Department of Medicine, Albert Einstein College of Medicine, Bronx, NY, 10461, USA; Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
14
Worth JRP, Holland BR, Beeton NJ, Schönfeld B, Rossetto M, Vaillancourt RE, Jordan GJ. Habitat type and dispersal mode underlie the capacity for plant migration across an intermittent seaway. ANNALS OF BOTANY 2017; 120:539-549. [PMID: 28961707 PMCID: PMC5737502 DOI: 10.1093/aob/mcx086] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/22/2016] [Accepted: 06/06/2017] [Indexed: 06/07/2023]
Abstract
BACKGROUND AND AIMS Investigating species distributions across geographic barriers is a commonly utilized method in biogeography to help understand the functional traits that allow plants to disperse successfully. Here the biogeographic pattern analysis approach is extended by using chloroplast DNA whole-genome 'mining' to examine the functional traits that have impacted the dispersal of widespread temperate forest species across an intermittent seaway, the 200 km wide Bass Strait of south-eastern Australia. METHODS Multiple, co-distributed species of both dry and wet forests were sampled from five regions on either side of the Strait to obtain insights into past dispersal of these biomes via seed. Using a next-generation sequencing-based pool-seq method, the sharing of single nucleotide polymorphisms (SNPs) was estimated between all five regions in the chloroplast genome. KEY RESULTS A total of 3335 SNPs were detected in 20 species. SNP sharing patterns between regions provided evidence for significant seed-mediated gene flow across the study area, including across Bass Strait. A higher proportion of shared SNPs in dry forest species, especially those dispersed by birds, compared with wet forest species suggests that dry forest species have undergone greater seed-mediated gene flow across the study region during past climatic oscillations and sea level changes associated with the interglacial/glacial cycles. CONCLUSIONS This finding is consistent with a greater propensity for long-distance dispersal for species of open habitats and proxy evidence that expansive areas of dry vegetation occurred during times of exposure of Bass Strait during glacials. Overall, this study provides novel genetic evidence that habitat type and its interaction with dispersal traits are major influences on dispersal of plants.
Affiliation(s)
- J R P Worth
- Forestry and Forest Products Research Institute, Matsunosato 1, Tsukuba, Ibaraki 305-8687, Japan
- B R Holland
- School of Physical Sciences, University of Tasmania, Private Bag 37, Hobart, Tasmania 7001, Australia
- N J Beeton
- School of Biological Sciences, University of Tasmania, Private Bag 55, Hobart, Tasmania 7001, Australia
- B Schönfeld
- School of Biological Sciences, University of Tasmania, Private Bag 55, Hobart, Tasmania 7001, Australia
- M Rossetto
- National Herbarium of New South Wales, Royal Botanic Gardens & Domain Trust, Mrs Macquaries Rd, Sydney, NSW, 2000, Australia
- R E Vaillancourt
- School of Biological Sciences, University of Tasmania, Private Bag 55, Hobart, Tasmania 7001, Australia
- G J Jordan
- School of Biological Sciences, University of Tasmania, Private Bag 55, Hobart, Tasmania 7001, Australia
15
Canzoniero JV, Cravero K, Park BH. The Impact of Collisions on the Ability to Detect Rare Mutant Alleles Using Barcode-Type Next-Generation Sequencing Techniques. Cancer Inform 2017. [DOI: 10.1177/1176935117719236] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/16/2022] Open
Abstract
Barcoding techniques are used to reduce error from next-generation sequencing, with applications ranging from understanding tumor subclone populations to detecting circulating tumor DNA. Collisions occur when more than one sample molecule is tagged by the same unique identifier (UID) and can result in failure to detect very-low-frequency mutations and error in estimating mutation frequency. Here, we created computer models of barcoding technique, with and without amplification bias introduced by the UID, and analyzed the effect of collisions for a range of mutant allele frequencies (1e−6 to 0.2), number of sample molecules (10 000 to 1e7), and number of UIDs (4^10 to 4^14). Inability to detect rare mutant alleles occurred in 0% to 100% of simulations, depending on collisions and number of mutant molecules. Collisions also introduced error in estimating mutant allele frequency resulting in underestimation of minor allele frequency. Incorporating an understanding of the effect of collisions into experimental design can allow for optimization of the number of sample molecules and number of UIDs to minimize the negative impact on rare mutant detection and mutant frequency estimation.
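To give a feel for how quickly collisions grow with the number of tagged molecules, here is a minimal sketch of the expected collision fraction. It is not the authors' simulation: it assumes each molecule draws its UID uniformly at random and independently (no amplification bias), and that UIDs are random DNA sequences of length L, so that there are 4^L possible UIDs. The function name and the chosen lengths are illustrative.

```python
import math


def collision_fraction(n_molecules: int, n_uids: int) -> float:
    """Expected fraction of molecules that share their UID with at least
    one other molecule, i.e. 1 - (1 - 1/n_uids) ** (n_molecules - 1),
    under uniform random UID assignment. Requires n_uids >= 2."""
    # log1p/expm1 keep precision when 1/n_uids is very small.
    return -math.expm1((n_molecules - 1) * math.log1p(-1 / n_uids))


# Illustrative grid: UID lengths of 10, 12 and 14 bases,
# and the two ends of the molecule-count range quoted in the abstract.
for uid_length in (10, 12, 14):
    n_uids = 4 ** uid_length
    for n_molecules in (10_000, 10_000_000):
        frac = collision_fraction(n_molecules, n_uids)
        print(f"L={uid_length:2d}  molecules={n_molecules:>10,d}  collisions={frac:.4f}")
```

Under these assumptions the fraction depends mainly on the ratio of molecules to UIDs, which is why lengthening the UID by a few bases can offset a large increase in input molecules.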
Affiliation(s)
- Karen Cravero
- The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA
- Ben Ho Park
- The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins Medicine, Baltimore, MD, USA
16
Targeted sequencing of both DNA strands barcoded and captured individually by RNA probes to identify genome-wide ultra-rare mutations. Sci Rep 2017; 7:3356. [PMID: 28611392 PMCID: PMC5469810 DOI: 10.1038/s41598-017-03448-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 01/30/2017] [Accepted: 05/05/2017] [Indexed: 12/13/2022] Open
Abstract
Next Generation Sequencing (NGS) has been widely implemented in biological research and has made a profound impact on patient care. One of the essential NGS applications is to identify disease-causing sequence variants, where high coverage and accuracy are needed. Here, we report a novel NGS pipeline, termed a Sequencing System of Digitalized Barcode Encrypted Single-stranded Library from Extremely Low (quality and quantity) DNA Input with Probe-based DNA Enrichment by RNA probes targeting DNA duplex (DEEPER-Seq). This method combines an ultra-sensitive single-stranded library construction with barcoding error correction, termed DEEPER-Library; and a DNA capture approach using RNA probes targeting both DNA strands, termed DEEPER-Capture. DEEPER-Seq can create NGS libraries from as little as 20 pg DNA with PCR error correcting capabilities, and capture target sequences at an average ratio of 29.2% by targeting both DNA strands simultaneously with over 98.6% coverage. Our method tags and sequences each of the two strands of a DNA duplex independently and only scores mutations that are found at the same position in both strands, which allows us to identify mutations with allelic fractions down to 0.03% in a whole exome sequencing (WES) study with a background error rate of one artificial error per 4.8 × 10^9 nucleotides.
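The rule of scoring only mutations found at the same position in both strands can be sketched as a simple intersection. This is an illustrative simplification, not the DEEPER-Seq pipeline: it assumes each strand's calls have already been reduced to a mapping from position to alternate base, with both strands expressed relative to the same reference strand. The function name and the example calls are hypothetical.

```python
def duplex_calls(top_strand: dict[int, str], bottom_strand: dict[int, str]) -> dict[int, str]:
    """Keep only variants seen at the same position, with the same
    alternate base, in the calls from both strands of a duplex."""
    return {
        pos: base
        for pos, base in top_strand.items()
        if bottom_strand.get(pos) == base
    }


# Hypothetical per-strand calls (position -> alternate base).
top = {101: "T", 250: "G", 812: "A"}
bottom = {101: "T", 250: "C", 900: "A"}

confirmed = duplex_calls(top, bottom)
print(confirmed)  # only position 101 agrees on both strands
```

Because a polymerase or sequencing error is unlikely to produce the same change at the same position on both independently tagged strands, requiring agreement removes most single-strand artifacts at the cost of discarding true variants seen on one strand only.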
17
Choi W, Jung GY. Highly multiplex and sensitive SNP genotyping method using a three-color fluorescence-labeled ligase detection reaction coupled with conformation-sensitive CE. Electrophoresis 2016; 38:513-520. [DOI: 10.1002/elps.201600369] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 08/08/2016] [Revised: 09/19/2016] [Accepted: 10/12/2016] [Indexed: 11/06/2022]
Affiliation(s)
- Woong Choi
- School of Interdisciplinary Bioscience and Bioengineering, Pohang University of Science and Technology, Pohang, Gyeongbuk, Korea
- Gyoo Yeol Jung
- School of Interdisciplinary Bioscience and Bioengineering, Pohang University of Science and Technology, Pohang, Gyeongbuk, Korea
- Department of Chemical Engineering, Pohang University of Science and Technology, Pohang, Gyeongbuk, Korea
18
Chen YJ, Wambach JA, DePass K, Wegner DJ, Chen SK, Zhang QY, Heins H, Cole FS, Hamvas A. Population-based frequency of surfactant dysfunction mutations in a native Chinese cohort. World J Pediatr 2016; 12:190-5. [PMID: 26547207 DOI: 10.1007/s12519-015-0047-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 04/19/2015] [Accepted: 07/14/2015] [Indexed: 12/30/2022]
Abstract
BACKGROUND Rare mutations in surfactant-associated genes contribute to neonatal respiratory distress syndrome. The frequency of mutations in these genes in the Chinese population is unknown. METHODS We obtained blood spots from the Guangxi Neonatal Screening Center in Nanning, China that included Han (n=443) and Zhuang (n=313) ethnic groups. We resequenced all exons of the surfactant proteins-B (SFTPB), -C (SFTPC), and the ATP-binding cassette member A3 (ABCA3) genes and compared the frequencies of 5 common and all rare variants. RESULTS We found minor differences in the frequencies of the common variants in the Han and Zhuang cohorts. We did not find any rare mutations in SFTPB or SFTPC, but we found 3 ABCA3 mutations in the Han [minor allele frequency (MAF)=0.003] and 7 in the Zhuang (MAF=0.011) cohorts (P=0.10). The ABCA3 mutations were unique to each cohort; five were novel. The collapsed carrier rate of rare ABCA3 mutations in the Han and Zhuang populations combined was 1.3%, which is significantly lower than that in the United States (P<0.001). CONCLUSION The population-based frequency of mutations in ABCA3 in south China newborns is significantly lower than that in the United States. The contribution of these rare ABCA3 mutations to disease burden in the south China population is still unknown.
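The reported frequencies can be reproduced from the cohort sizes with a few lines of arithmetic. The sketch below assumes that each reported mutation corresponds to one heterozygous carrier (one mutant allele in a diploid individual); the abstract does not state this explicitly, so treat it as an assumption. The function name is illustrative.

```python
def minor_allele_frequency(mutant_alleles: int, n_individuals: int, ploidy: int = 2) -> float:
    """Mutant alleles divided by the total number of chromosomes sampled."""
    return mutant_alleles / (n_individuals * ploidy)


# Cohort sizes and mutation counts as given in the abstract.
han = minor_allele_frequency(3, 443)       # 3 / 886
zhuang = minor_allele_frequency(7, 313)    # 7 / 626

# Collapsed carrier rate: carriers of any rare ABCA3 mutation over all individuals,
# assuming one carrier per reported mutation.
carrier_rate = (3 + 7) / (443 + 313)

print(f"Han MAF={han:.3f}  Zhuang MAF={zhuang:.3f}  carrier rate={carrier_rate:.1%}")
```

Under that assumption the values round to the figures quoted in the abstract (0.003, 0.011 and 1.3%).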
Affiliation(s)
- Yu-Jun Chen
- Division of Neonatology, Department of Pediatrics, the First Affiliated Hospital of Guangxi Medical University, Nanning, China; Division of Newborn Medicine, Edward Mallinckrodt Department of Pediatrics, Washington University School of Medicine, St. Louis, USA
- Jennifer Anne Wambach
- Division of Newborn Medicine, Edward Mallinckrodt Department of Pediatrics, Washington University School of Medicine, St. Louis, USA
- Kelcey DePass
- Division of Newborn Medicine, Edward Mallinckrodt Department of Pediatrics, Washington University School of Medicine, St. Louis, USA
- Daniel James Wegner
- Division of Newborn Medicine, Edward Mallinckrodt Department of Pediatrics, Washington University School of Medicine, St. Louis, USA
- Shao-Ke Chen
- Department of Pediatrics, Guangxi Maternal and Child Health Hospital, Nanning, China
- Qun-Yuan Zhang
- Division of Statistical Genomics, Department of Genetics, Washington University School of Medicine, St. Louis, USA
- Hillary Heins
- Division of Newborn Medicine, Edward Mallinckrodt Department of Pediatrics, Washington University School of Medicine, St. Louis, USA
- Francis Sessions Cole
- Division of Newborn Medicine, Edward Mallinckrodt Department of Pediatrics, Washington University School of Medicine, St. Louis, USA
- Aaron Hamvas
- Division of Newborn Medicine, Edward Mallinckrodt Department of Pediatrics, Washington University School of Medicine, St. Louis, USA; Division of Neonatology, Ann and Robert H. Lurie Children's Hospital, 225 E. Chicago Ave, Box No. 45, Chicago, IL, 60611, USA
19
Postula M, Janicki PK, Eyileten C, Rosiak M, Kaplon-Cieslicka A, Sugino S, Wilimski R, Kosior DA, Opolski G, Filipiak KJ, Mirowska-Guzel D. Next-generation re-sequencing of genes involved in increased platelet reactivity in diabetic patients on acetylsalicylic acid. Platelets 2015; 27:357-64. [PMID: 26599574 DOI: 10.3109/09537104.2015.1109071] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 01/10/2023]
Abstract
The objective of this study was to investigate whether rare missense genetic variants in several genes related to platelet functions and acetylsalicylic acid (ASA) response are associated with platelet reactivity in patients with type 2 diabetes (T2D) on ASA therapy. Fifty-eight exons and corresponding introns of eight selected genes, including PTGS1, PTGS2, TBXAS1, PTGIS, ADRA2A, ADRA2B, TBXA2R, and P2RY1, were re-sequenced in 230 DNA samples from T2D patients by using pooled PCR amplification and next-generation sequencing on the Illumina HiSeq2000. The observed non-synonymous variants were confirmed by individual genotyping of 384 DNA samples comprising the individuals from the original discovery pools and an additional verification cohort of 154 ASA-treated T2D patients. The association between the investigated phenotypes (ASA-induced changes in platelet reactivity by PFA-100, VerifyNow and serum thromboxane B2 level [sTxB2]) and the accumulation of rare missense variants (genetic burden) in the investigated genes was tested using statistical collapsing tests. We identified a total of 35 exonic variants, including 3 common missense variants, 15 rare missense variants, and 17 synonymous variants in the 8 investigated genes. The rare missense variants exhibited a statistically significant difference in the accumulation pattern between groups of patients with increased and normal platelet reactivity based on the PFA-100 assay. Our study suggests that the genetic burden of rare functional variants in eight genes may contribute to differences in platelet reactivity measured with the PFA-100 assay in T2D patients treated with ASA.
Affiliation(s)
- Marek Postula
- Department of Experimental and Clinical Pharmacology, Medical University of Warsaw, Center for Preclinical Research and Technology CEPT, Warsaw, Poland; Perioperative Genomics Laboratory, Penn State College of Medicine, Hershey, PA, USA
- Piotr K Janicki
- Perioperative Genomics Laboratory, Penn State College of Medicine, Hershey, PA, USA
- Ceren Eyileten
- Department of Experimental and Clinical Pharmacology, Medical University of Warsaw, Center for Preclinical Research and Technology CEPT, Warsaw, Poland
- Marek Rosiak
- Department of Experimental and Clinical Pharmacology, Medical University of Warsaw, Center for Preclinical Research and Technology CEPT, Warsaw, Poland; Department of Cardiology and Hypertension, Central Clinical Hospital, The Ministry of the Interior, Warsaw, Poland
- Shigekazu Sugino
- Perioperative Genomics Laboratory, Penn State College of Medicine, Hershey, PA, USA
- Radosław Wilimski
- Department of Cardiac Surgery, Medical University of Warsaw, Warsaw, Poland
- Dariusz A Kosior
- Department of Cardiology and Hypertension, Central Clinical Hospital, The Ministry of the Interior, Warsaw, Poland; Department of Applied Physiology, Mossakowski Medical Research Centre, Polish Academy of Sciences, Warsaw, Poland
- Grzegorz Opolski
- Department of Cardiology, Medical University of Warsaw, Warsaw, Poland
- Dagmara Mirowska-Guzel
- Department of Experimental and Clinical Pharmacology, Medical University of Warsaw, Center for Preclinical Research and Technology CEPT, Warsaw, Poland
20
Torgerson DG, Giri T, Druley TE, Zheng J, Huntsman S, Seibold MA, Young AL, Schweiger T, Yin-Declue H, Sajol GD, Schechtman KB, Hernandez RD, Randolph AG, Bacharier LB, Castro M. Pooled Sequencing of Candidate Genes Implicates Rare Variants in the Development of Asthma Following Severe RSV Bronchiolitis in Infancy. PLoS One 2015; 10:e0142649. [PMID: 26587832 PMCID: PMC4654486 DOI: 10.1371/journal.pone.0142649] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 09/18/2014] [Accepted: 02/06/2015] [Indexed: 12/17/2022] Open
Abstract
Severe infection with respiratory syncytial virus (RSV) during infancy is strongly associated with the development of asthma. To identify genetic variation that contributes to asthma following severe RSV bronchiolitis during infancy, we sequenced the coding exons of 131 asthma candidate genes in 182 European and African American children with severe RSV bronchiolitis in infancy using anonymous pools for variant discovery, and then directly genotyped a set of 190 nonsynonymous variants. Association testing was performed for physician-diagnosed asthma before the 7th birthday (asthma) using genotypes from 6,500 individuals from the Exome Sequencing Project (ESP) as controls to gain statistical power. In addition, among patients with severe RSV bronchiolitis during infancy, we examined genetic associations with asthma, active asthma, persistent wheeze, and bronchial hyperreactivity (methacholine PC20) at age 6 years. We identified four rare nonsynonymous variants that were significantly associated with asthma following severe RSV bronchiolitis, including single variants in ADRB2, FLG and NCAM1 in European Americans (p = 4.6 × 10^-4, 1.9 × 10^-13 and 5.0 × 10^-5, respectively), and NOS1 in African Americans (p = 2.3 × 10^-11). One of the variants was a highly functional nonsynonymous variant in ADRB2 (rs1800888), which was also nominally associated with asthma (p = 0.027) and active asthma (p = 0.013) among European Americans with severe RSV bronchiolitis without including the ESP. Our results suggest that rare nonsynonymous variants contribute to the development of asthma following severe RSV bronchiolitis in infancy, notably in ADRB2. Additional studies are required to explore the role of rare variants in the etiology of asthma and asthma-related traits following severe RSV bronchiolitis.
Affiliation(s)
- Dara G. Torgerson
- Department of Medicine, University of California San Francisco, San Francisco, California, United States of America
- Tusar Giri
- Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Todd E. Druley
- Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Jie Zheng
- Department of Biostatistics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Scott Huntsman
- Department of Medicine, University of California San Francisco, San Francisco, California, United States of America
- Max A. Seibold
- Integrated Center for Genes, Environment and Health, National Jewish Health, Denver, Colorado, United States of America
- Andrew L. Young
- Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Toni Schweiger
- Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Huiqing Yin-Declue
- Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Geneline D. Sajol
- Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Kenneth B Schechtman
- Department of Biostatistics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Ryan D. Hernandez
- Department of Bioengineering and Therapeutic Sciences, Institute of Human Genetics, and California Institute of Quantitative Biosciences (QB3), University of California San Francisco, San Francisco, California, United States of America
- Adrienne G. Randolph
- Department of Anesthesiology, Boston Children’s Hospital, Boston, Massachusetts, United States of America
- Leonard B. Bacharier
- Department of Pediatrics, Washington University School of Medicine and St. Louis Children’s Hospital, St. Louis, Missouri, United States of America
- Mario Castro
- Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri, United States of America
21
Gregory MT, Bertout JA, Ericson NG, Taylor SD, Mukherjee R, Robins HS, Drescher CW, Bielas JH. Targeted single molecule mutation detection with massively parallel sequencing. Nucleic Acids Res 2015; 44:e22. [PMID: 26384417 PMCID: PMC4756847 DOI: 10.1093/nar/gkv915] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Received: 05/15/2015] [Accepted: 09/02/2015] [Indexed: 11/14/2022] Open
Abstract
Next-generation sequencing (NGS) technologies have transformed genomic research and have the potential to revolutionize clinical medicine. However, the background error rates of sequencing instruments and limitations in targeted read coverage have precluded the detection of rare DNA sequence variants by NGS. Here we describe a method, termed CypherSeq, which combines double-stranded barcoding error correction and rolling circle amplification (RCA)-based target enrichment to vastly improve NGS-based rare variant detection. The CypherSeq methodology involves the ligation of sample DNA into circular vectors, which contain double-stranded barcodes for computational error correction and adapters for library preparation and sequencing. CypherSeq is capable of detecting rare mutations genome-wide as well as those within specific target genes via RCA-based enrichment. We demonstrate that CypherSeq is capable of correcting errors incurred during library preparation and sequencing to reproducibly detect mutations down to a frequency of 2.4 × 10^-7 per base pair, and report the frequency and spectra of spontaneous and ethyl methanesulfonate-induced mutations across the Saccharomyces cerevisiae genome.
Affiliation(s)
- Mark T Gregory
- Translational Research Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Jessica A Bertout
- Translational Research Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Nolan G Ericson
- Translational Research Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Sean D Taylor
- Translational Research Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Rithun Mukherjee
- Computational Biology Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Harlan S Robins
- Computational Biology Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Human Biology Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Charles W Drescher
- Translational Research Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Jason H Bielas
- Translational Research Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Human Biology Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Department of Pathology, University of Washington, Seattle, WA 98195, USA
22
Silva-Junior OB, Faria DA, Grattapaglia D. A flexible multi-species genome-wide 60K SNP chip developed from pooled resequencing of 240 Eucalyptus tree genomes across 12 species. THE NEW PHYTOLOGIST 2015; 206:1527-40. [PMID: 25684350 DOI: 10.1111/nph.13322] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Received: 08/22/2014] [Accepted: 01/02/2015] [Indexed: 05/23/2023]
Abstract
We used whole genome resequencing of pooled individuals to develop a high-density single-nucleotide polymorphism (SNP) chip for Eucalyptus. Genomes of 240 trees of 12 species were sequenced at 3.5× each, and 46 997 586 raw SNP variants were subject to multivariable filtering metrics toward a multispecies, genome-wide distributed chip content. Of the 60 904 SNPs on the chip, 59 222 were genotyped and 51 204 were polymorphic across 14 Eucalyptus species, providing a 96% genome-wide coverage with 1 SNP/12-20 kb, and 47 069 SNPs at ≤ 10 kb from 30 444 of the 33 917 genes in the Eucalyptus genome. Given the EUChip60K multi-species genotyping flexibility, we show that both the sample size and taxonomic composition of cluster files impact heterozygous call specificity and sensitivity by benchmarking against 'gold standard' genotypes derived from deeply sequenced individual tree genomes. Thousands of SNPs were shared across species, likely representing ancient variants that arose before the split of these taxa, hinting at a recent eucalypt radiation. We show that the variable SNP filtering constraints allowed coverage of the entire site frequency spectrum, mitigating SNP ascertainment bias. The EUChip60K represents an outstanding tool with which to address population genomics questions in Eucalyptus and to empower genomic selection, GWAS and the broader study of complex trait variation in eucalypts.
Affiliation(s)
- Orzenil B Silva-Junior
- Laboratório de Bioinformática, EMBRAPA Recursos Genéticos e Biotecnologia, PqEB, 70770-970, Brasilia, DF, Brazil
- Programa de Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, SGAN 916, 70790-160, Brasilia, DF, Brazil
- Danielle A Faria
- Laboratório de Genética Vegetal, EMBRAPA Recursos Genéticos e Biotecnologia, PqEB, 70770-970, Brasilia, DF, Brazil
- Dario Grattapaglia
- Programa de Ciências Genômicas e Biotecnologia, Universidade Católica de Brasília, SGAN 916, 70790-160, Brasilia, DF, Brazil
- Laboratório de Genética Vegetal, EMBRAPA Recursos Genéticos e Biotecnologia, PqEB, 70770-970, Brasilia, DF, Brazil
23
Cady J, Allred P, Bali T, Pestronk A, Goate A, Miller TM, Mitra R, Ravits J, Harms MB, Baloh RH. Amyotrophic lateral sclerosis onset is influenced by the burden of rare variants in known amyotrophic lateral sclerosis genes. Ann Neurol 2015; 77:100-13. [PMID: 25382069 PMCID: PMC4293318 DOI: 10.1002/ana.24306] [Citation(s) in RCA: 165] [Impact Index Per Article: 16.5] [Received: 09/08/2014] [Revised: 10/16/2014] [Accepted: 11/02/2014] [Indexed: 12/11/2022]
Abstract
OBJECTIVE To define the genetic landscape of amyotrophic lateral sclerosis (ALS) and assess the contribution of possible oligogenic inheritance, we aimed to comprehensively sequence 17 known ALS genes in 391 ALS patients from the United States. METHODS Targeted pooled-sample sequencing was used to identify variants in 17 ALS genes. Fragment size analysis was used to define ATXN2 and C9ORF72 expansion sizes. Genotype-phenotype correlations were made with individual variants and total burden of variants. Rare variant associations for risk of ALS were investigated at both the single variant and gene level. RESULTS A total of 64.3% of familial and 27.8% of sporadic subjects carried potentially pathogenic novel or rare coding variants identified by sequencing or an expanded repeat in C9ORF72 or ATXN2; 3.8% of subjects had variants in >1 ALS gene, and these individuals had disease onset 10 years earlier (p = 0.0046) than subjects with variants in a single gene. The number of potentially pathogenic coding variants did not influence disease duration or site of onset. INTERPRETATION Rare and potentially pathogenic variants in known ALS genes are present in >25% of apparently sporadic and 64% of familial patients, significantly higher than previous reports using less comprehensive sequencing approaches. A significant number of subjects carried variants in >1 gene, which influenced the age of symptom onset and supports oligogenic inheritance as relevant to disease pathogenesis.
Affiliation(s)
- Janet Cady
- Department of Neurology, Washington University, St. Louis, MO, USA
- Peggy Allred
- Department of Neurology, Cedars Sinai Medical Center, Los Angeles, CA, USA
- Taha Bali
- Department of Neurology, Washington University, St. Louis, MO, USA
- Alan Pestronk
- Department of Neurology, Washington University, St. Louis, MO, USA
- Alison Goate
- Department of Neurology, Washington University, St. Louis, MO, USA
- Department of Psychiatry, Washington University, St. Louis, MO, USA
- Hope Center for Neurological Disorders, Washington University, St. Louis, MO, USA
- Timothy M. Miller
- Department of Neurology, Washington University, St. Louis, MO, USA
- Hope Center for Neurological Disorders, Washington University, St. Louis, MO, USA
- Rob Mitra
- Department of Genetics, Washington University, St. Louis, MO, USA
- John Ravits
- Department of Neurosciences, University of California, San Diego, La Jolla, CA, USA
- Matthew B. Harms
- Department of Neurology, Washington University, St. Louis, MO, USA
- Hope Center for Neurological Disorders, Washington University, St. Louis, MO, USA
- Robert H. Baloh
- Department of Neurology, Cedars Sinai Medical Center, Los Angeles, CA, USA
24
Coghlan MA, Shifren A, Huang HJ, Russell TD, Mitra RD, Zhang Q, Wegner DJ, Cole FS, Hamvas A. Sequencing of idiopathic pulmonary fibrosis-related genes reveals independent single gene associations. BMJ Open Respir Res 2014; 1:e000057. [PMID: 25553246 PMCID: PMC4265083 DOI: 10.1136/bmjresp-2014-000057] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Received: 07/29/2014] [Revised: 10/31/2014] [Accepted: 11/03/2014] [Indexed: 11/21/2022] Open
Abstract
Background Previous studies investigating a genetic basis for idiopathic pulmonary fibrosis (IPF) have focused on resequencing single genes in IPF kindreds or cohorts to determine the genetic contributions to IPF. None has investigated interactions among the candidate genes. Objective To compare the frequencies and interactions of mutations in six IPF-associated genes in a cohort of 132 individuals with IPF with those of a disease-control cohort of 192 individuals with chronic obstructive pulmonary disease (COPD) and the population represented in the Exome Variant Server. Methods We resequenced the genes encoding surfactant proteins A2 (SFTPA2), and C (SFTPC), the ATP binding cassette member A3 (ABCA3), telomerase (TERT), thyroid transcription factor (NKX2-1) and mucin 5B (MUC5B) and compared the collapsed frequencies of rare (minor allele frequency <1%), computationally predicted deleterious variants in each cohort. We also genotyped a common MUC5B promoter variant that is over-represented in individuals with IPF. Results We found 15 mutations in 14 individuals (11%) in the IPF cohort: (SFTPA2 (n=1), SFTPC (n=5), ABCA3 (n=4) and TERT (n=5)). No individual with IPF had two different mutations, but one individual with IPF was homozygous for p.E292V, the most common ABCA3 disease-causing variant. We did not detect an interaction between any of the mutations and the MUC5B promoter variant. Conclusions Rare mutations in SFTPA2, SFTPC and TERT are collectively over-represented in individuals with IPF. Genetic analysis and counselling should be considered as part of the IPF evaluation.
Affiliation(s)
- Meghan A Coghlan
- Division of Newborn Medicine, Edward Mallinckrodt Department of Pediatrics, Washington University School of Medicine, St. Louis, Missouri, USA
- Adrian Shifren
- Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri, USA
- Howard J Huang
- Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri, USA
- Tonya D Russell
- Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri, USA
- Robi D Mitra
- Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri, USA
- Qunyuan Zhang
- Division of Statistical Genomics, Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, USA
- Daniel J Wegner
- Division of Newborn Medicine, Edward Mallinckrodt Department of Pediatrics, Washington University School of Medicine, St. Louis, Missouri, USA
- F Sessions Cole
- Division of Newborn Medicine, Edward Mallinckrodt Department of Pediatrics, Washington University School of Medicine, St. Louis, Missouri, USA
- Aaron Hamvas
- Division of Newborn Medicine, Edward Mallinckrodt Department of Pediatrics, Washington University School of Medicine, St. Louis, Missouri, USA; Division of Neonatology, Department of Pediatrics, Ann and Robert H. Lurie Children's Hospital of Chicago, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
25
Sequencing pools of individuals — mining genome-wide polymorphism data without big funding. Nat Rev Genet 2014; 15:749-63. [DOI: 10.1038/nrg3803] [Citation(s) in RCA: 512] [Impact Index Per Article: 46.5] [Indexed: 12/16/2022]
26
Synonymous ABCA3 variants do not increase risk for neonatal respiratory distress syndrome. J Pediatr 2014; 164:1316-21.e3. [PMID: 24657120 PMCID: PMC4035386 DOI: 10.1016/j.jpeds.2014.02.021] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 09/03/2013] [Revised: 12/23/2013] [Accepted: 02/06/2014] [Indexed: 12/25/2022]
Abstract
OBJECTIVE To determine whether synonymous variants in the adenosine triphosphate-binding cassette A3 transporter (ABCA3) gene increase the risk for neonatal respiratory distress syndrome (RDS) in term and late preterm infants of European and African descent. STUDY DESIGN Using next-generation pooled sequencing of race-stratified DNA samples from infants of European and African descent at ≥34 weeks gestation with and without RDS (n = 503), we scanned all exons of ABCA3, validated each synonymous variant with an independent genotyping platform, and evaluated race-stratified disease risk associated with common synonymous variants and collapsed frequencies of rare synonymous variants. RESULTS The synonymous ABCA3 variant frequency spectrum differs between infants of European descent and those of African descent. Using in silico prediction programs and statistical strategies, we found no potentially disruptive synonymous ABCA3 variants or evidence of selection pressure. Individual common synonymous variants and collapsed frequencies of rare synonymous variants did not increase disease risk in term and late-preterm infants of European or African descent. CONCLUSION In contrast to rare, nonsynonymous ABCA3 mutations, synonymous ABCA3 variants do not increase the risk for neonatal RDS among term and late-preterm infants of European or African descent.
27
Fanjul-Fernández M, Quesada V, Cabanillas R, Cadiñanos J, Fontanil T, Obaya A, Ramsay AJ, Llorente JL, Astudillo A, Cal S, López-Otín C. Cell-cell adhesion genes CTNNA2 and CTNNA3 are tumour suppressors frequently mutated in laryngeal carcinomas. Nat Commun 2014; 4:2531. [PMID: 24100690 DOI: 10.1038/ncomms3531] [Citation(s) in RCA: 64] [Impact Index Per Article: 5.8] [Received: 05/31/2013] [Accepted: 09/02/2013] [Indexed: 12/23/2022]
Abstract
Laryngeal squamous cell carcinoma is a frequent and significant cause of morbidity and mortality. Here we explore the biological basis of this aggressive tumour, and identify two cell-cell adhesion genes as recurrently mutated in this malignancy. We first perform exome sequencing of four laryngeal carcinomas and their matched normal tissues. Among the 569 genes found to present somatic mutations, and based on their recurrence or functional relevance in cancer, we select 40 for further validation in 86 additional laryngeal carcinomas. We detect frequent mutations (14 of 90, 15%) in CTNNA2 and CTNNA3-encoding α-catenins. Functional studies reveal an increase in the migration and invasive ability of head and neck squamous cell carcinoma cells producing mutated forms of CTNNA2 and CTNNA3 or in cells where both α-catenins are silenced. Analysis of the clinical relevance of these mutations demonstrates that they are associated with poor prognosis. We conclude that CTNNA2 and CTNNA3 are tumour suppressor genes frequently mutated in laryngeal carcinomas.
Affiliation(s)
- Miriam Fanjul-Fernández
- 1] Departamento de Bioquímica y Biología Molecular, Instituto Universitario de Oncología (IUOPA), Universidad de Oviedo, Oviedo 33006, Spain [2]
28
Anasagasti A, Barandika O, Irigoyen C, Benitez BA, Cooper B, Cruchaga C, López de Munain A, Ruiz-Ederra J. Genetic high throughput screening in Retinitis Pigmentosa based on high resolution melting (HRM) analysis. Exp Eye Res 2014; 116:386-394. [PMID: 24416769 DOI: 10.1016/j.exer.2013.10.011] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 12/23/2022]
Abstract
Retinitis Pigmentosa (RP) involves a group of genetically determined retinal diseases caused by a large number of mutations that result in rod photoreceptor cell death followed by gradual death of cone cells. Most cases of RP are monogenic, with more than 80 associated genes identified so far. The high number of genes and variants involved in RP, among other factors, makes the molecular characterization of RP a real challenge for many patients. Although HRM has been used for the analysis of isolated variants or single RP genes, to our knowledge this is the first study that uses HRM analysis for high-throughput screening of several RP genes. Our main goal was to test the suitability of HRM analysis as a genetic screening technique in RP, and to compare its performance with two of the most widely used NGS platforms, Illumina and PGM-Ion Torrent technologies. RP patients (n = 96) were clinically diagnosed at the Ophthalmology Department of Donostia University Hospital, Spain. We analyzed a total of 16 RP genes that meet the following inclusion criteria: 1) size: genes with transcripts of less than 4 kb; 2) number of exons: genes with up to 22 exons; and 3) prevalence: genes reported to account for at least 0.4% of total RP cases worldwide. For comparison purposes, the RHO gene was also sequenced with Illumina (GAII; Illumina), Ion semiconductor technologies (PGM; Life Technologies) and Sanger sequencing (ABI 3130xl platform; Applied Biosystems). Detected variants were confirmed in all cases by Sanger sequencing and tested for co-segregation in the family of affected probands. We identified a total of 65 genetic variants, 15 of which (23%) were novel, in 49 out of 96 patients. Among them, 14 (4 novel) are probable disease-causing genetic variants in 7 RP genes, affecting 15 patients. Our HRM analysis-based study proved to be a cost-effective and rapid method that provides an accurate identification of genetic RP variants. This approach is effective for medium-sized (<4 kb transcript) RP genes, which constitute over 80% of the total of known RP genes.
Affiliation(s)
- Ander Anasagasti
- Department of Neuroscience, Instituto Biodonostia, Paseo Dr. Begiristain s/n, E-20014 San Sebastián, Spain
- Olatz Barandika
- Department of Neuroscience, Instituto Biodonostia, Paseo Dr. Begiristain s/n, E-20014 San Sebastián, Spain
- Cristina Irigoyen
- Department of Ophthalmology, Hospital Universitario Donostia, San Sebastián, Spain
- Bruno A Benitez
- Department of Psychiatry, Washington University, St. Louis, MO, USA
- Breanna Cooper
- Department of Psychiatry, Washington University, St. Louis, MO, USA
- Carlos Cruchaga
- Department of Psychiatry, Washington University, St. Louis, MO, USA; Hope Center Program on Protein Aggregation and Neurodegeneration, Washington University, St. Louis, MO, USA
- Adolfo López de Munain
- Department of Neuroscience, Instituto Biodonostia, Paseo Dr. Begiristain s/n, E-20014 San Sebastián, Spain; Department of Neurology, Hospital Universitario Donostia, San Sebastián, Spain; CIBERNED, Centro de Investigaciones Biomédicas en Red sobre Enfermedades Neurodegenerativas, Instituto Carlos III, Ministerio de Economía y Competitividad, Spain; Department of Neurosciences, University of the Basque Country UPV-EHU, Spain; Euskampus, University of the Basque Country UPV-EHU, Spain
- Javier Ruiz-Ederra
- Department of Neuroscience, Instituto Biodonostia, Paseo Dr. Begiristain s/n, E-20014 San Sebastián, Spain
29
Shapter FM, Cross M, Ablett G, Malory S, Chivers IH, King GJ, Henry RJ. High-throughput sequencing and mutagenesis to accelerate the domestication of Microlaena stipoides as a new food crop. PLoS One 2013; 8:e82641. [PMID: 24367532 PMCID: PMC3867367 DOI: 10.1371/journal.pone.0082641] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 06/24/2013] [Accepted: 10/26/2013] [Indexed: 12/21/2022]
Abstract
Global food demand, climatic variability and reduced land availability are driving the need for domestication of new crop species. The accelerated domestication of a rice-like Australian dryland polyploid grass, Microlaena stipoides (Poaceae), was targeted using chemical mutagenesis in conjunction with high throughput sequencing of genes for key domestication traits. While M. stipoides has previously been identified as having potential as a new grain crop for human consumption, only a limited understanding of its genetic diversity and breeding system was available to aid the domestication process. Next generation sequencing of deeply-pooled target amplicons estimated allelic diversity of a selected base population at 14.3 SNP/Mb and identified novel, putatively mutation-induced polymorphisms at about 2.4 mutations/Mb. A 97% lethal dose (LD₉₇) of ethyl methanesulfonate treatment was applied without inducing sterility in this polyploid species. Forward and reverse genetic screens identified beneficial alleles for the domestication trait, seed-shattering. Unique phenotypes observed in the M2 population suggest the potential for rapid accumulation of beneficial traits without recourse to a traditional cross-breeding strategy. This approach may be applicable to other wild species, unlocking their potential as new food, fibre and fuel crops.
Affiliation(s)
- Frances M. Shapter
- Southern Cross Plant Science, Southern Cross University, Lismore, New South Wales, Australia
- Michael Cross
- Southern Cross Plant Science, Southern Cross University, Lismore, New South Wales, Australia
- Gary Ablett
- Southern Cross Plant Science, Southern Cross University, Lismore, New South Wales, Australia
- Sylvia Malory
- Southern Cross Plant Science, Southern Cross University, Lismore, New South Wales, Australia
- Ian H. Chivers
- Southern Cross Plant Science, Southern Cross University, Lismore, New South Wales, Australia
- Native Seeds Pty Ltd, Sandringham, Victoria, Australia
- Graham J. King
- Southern Cross Plant Science, Southern Cross University, Lismore, New South Wales, Australia
- Robert J. Henry
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, Queensland, Australia
30
Zhao Z, Wang W, Wei Z. An empirical Bayes testing procedure for detecting variants in analysis of next generation sequencing data. Ann Appl Stat 2013. [DOI: 10.1214/13-aoas660] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/19/2022]
31
Konczal M, Koteja P, Stuglik MT, Radwan J, Babik W. Accuracy of allele frequency estimation using pooled RNA-Seq. Mol Ecol Resour 2013; 14:381-92. [PMID: 24119300 DOI: 10.1111/1755-0998.12186] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Received: 06/21/2013] [Revised: 09/30/2013] [Accepted: 10/06/2013] [Indexed: 11/28/2022]
Abstract
For nonmodel organisms, genome-wide information that describes functionally relevant variation may be obtained by RNA-Seq following de novo transcriptome assembly. While sequencing has become relatively inexpensive, the preparation of a large number of sequencing libraries remains prohibitively expensive for population genetic analyses of nonmodel species. Pooling samples may then be an attractive alternative. To test whether pooled RNA-Seq accurately predicts true allele frequencies, we analysed the liver transcriptomes of 10 bank voles. Each sample was sequenced both as an individually barcoded library and as a part of a pool. Equal amounts of total RNA from each vole were pooled prior to mRNA selection and library construction. Reads were mapped onto the de novo assembled reference transcriptome. High-quality genotypes for individual voles, determined for 23,682 SNPs, provided information on 'true' allele frequencies; allele frequencies estimated from the pool were then compared with these values. 'True' frequencies and those estimated from the pool were highly correlated. Mean relative estimation error was 21% and did not depend on expression level. However, we also observed a minor effect of interindividual variation in gene expression and allele-specific gene expression influencing allele frequency estimation accuracy. Moreover, we observed a strong negative relationship between minor allele frequency and relative estimation error. Our results indicate that pooled RNA-Seq exhibits accuracy comparable with pooled genome resequencing, but variation in expression level between individuals should be assessed and accounted for. This should help in taking into account the difference in accuracy between conservatively expressed transcripts and those that are variable in expression level.
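The comparison described in this abstract can be illustrated with a minimal Python sketch. It assumes that the "relative estimation error" is the absolute difference between the pooled and the genotype-based frequency divided by the genotype-based frequency; the abstract does not give the exact formula, and the function names and example numbers below are hypothetical.

```python
def true_allele_frequency(genotypes):
    """Frequency of the alternative allele from individual diploid genotypes.

    genotypes: list of alternative-allele counts per individual (0, 1 or 2).
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    return sum(genotypes) / (2 * len(genotypes))


def pooled_allele_frequency(alt_reads, total_reads):
    """Frequency of the alternative allele estimated from pooled read counts."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def mean_relative_error(true_freqs, pooled_freqs):
    """Mean of |pooled - true| / true over SNPs (assumed definition).

    SNPs with a true frequency of zero are skipped to avoid division by zero.
    """
    errors = [abs(p - t) / t for t, p in zip(true_freqs, pooled_freqs) if t > 0]
    if not errors:
        raise ValueError("no SNP with a non-zero true frequency")
    return sum(errors) / len(errors)


# Hypothetical example: one SNP genotyped in 10 individuals and read from a pool.
truth = true_allele_frequency([0, 1, 2, 1, 0, 0, 1, 0, 0, 0])
estimate = pooled_allele_frequency(alt_reads=30, total_reads=100)
error = mean_relative_error([truth], [estimate])
```

Dividing by the true frequency means that the same absolute deviation weighs more heavily on rare alleles, which is in line with the negative relationship between minor allele frequency and relative error that the abstract reports.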
Affiliation(s)
- M Konczal
- Institute of Environmental Sciences, Jagiellonian University, Gronostajowa 7, 30-387, Kraków, Poland
32
Rellstab C, Zoller S, Tedder A, Gugerli F, Fischer MC. Validation of SNP allele frequencies determined by pooled next-generation sequencing in natural populations of a non-model plant species. PLoS One 2013; 8:e80422. [PMID: 24244686 PMCID: PMC3820589 DOI: 10.1371/journal.pone.0080422] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.0] [Received: 05/16/2013] [Accepted: 10/02/2013] [Indexed: 11/28/2022]
Abstract
Sequencing of pooled samples (Pool-Seq) using next-generation sequencing technologies has become increasingly popular, because it represents a rapid and cost-effective method to determine allele frequencies for single nucleotide polymorphisms (SNPs) in population pools. Validation of allele frequencies determined by Pool-Seq has been attempted using an individual genotyping approach, but these studies tend to use samples from existing model organism databases or DNA stores, and do not validate a realistic setup for sampling natural populations. Here we used pyrosequencing to validate allele frequencies determined by Pool-Seq in three natural populations of Arabidopsis halleri (Brassicaceae). The allele frequency estimates of the pooled population samples (consisting of 20 individual plant DNA samples) were determined after mapping Illumina reads to (i) the publicly available, high-quality reference genome of a closely related species (Arabidopsis thaliana) and (ii) our own de novo draft genome assembly of A. halleri. We then pyrosequenced nine selected SNPs using the same individuals from each population, resulting in a total of 540 samples. Our results show a highly significant and accurate relationship between pooled and individually determined allele frequencies, irrespective of the reference genome used. Allele frequencies differed on average by less than 4%. There was no tendency that either the Pool-Seq or the individual-based approach resulted in higher or lower estimates of allele frequencies. Moreover, the rather high coverage in the mapping to the two reference genomes, ranging from 55 to 284x, had no significant effect on the accuracy of the Pool-Seq. A resampling analysis showed that only very low coverage values (below 10-20x) would substantially reduce the precision of the method. We therefore conclude that a pooled re-sequencing approach is well suited for analyses of genetic variation in natural populations.
Affiliation(s)
- Christian Rellstab
- Biodiversity and Conservation Biology, Swiss Federal Research Institute WSL, Birmensdorf, Switzerland
- Stefan Zoller
- Genetic Diversity Centre, ETH Zürich, Zürich, Switzerland
- Andrew Tedder
- Institute of Evolutionary Biology and Environmental Studies and Institute of Plant Biology, University of Zürich, Zürich, Switzerland
- Felix Gugerli
- Biodiversity and Conservation Biology, Swiss Federal Research Institute WSL, Birmensdorf, Switzerland
33
Chun S, Plunkett J, Teramo K, Muglia LJ, Fay JC. Fine-mapping an association of FSHR with preterm birth in a Finnish population. PLoS One 2013; 8:e78032. [PMID: 24205076 PMCID: PMC3812121 DOI: 10.1371/journal.pone.0078032] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 05/14/2013] [Accepted: 09/09/2013] [Indexed: 12/18/2022]
Abstract
Preterm birth is a complex disorder defined by gestations of less than 37 weeks. While preterm birth is estimated to have a significant genetic component, relatively few genes have been associated with preterm birth. Polymorphism in one such gene, follicle-stimulating hormone receptor (FSHR), has been associated with preterm birth in Finnish and African American mothers but not other populations. To refine the genetic association of FSHR with preterm birth we conducted a fine-mapping study at the FSHR locus in a Finnish cohort. We sequenced a total of 44 kb, including protein-coding and conserved non-coding regions, in 127 preterm and 135 term mothers. Overall, we identified 288 single nucleotide variants and 65 insertion/deletions of 1-2 bp across all subjects. While no common SNPs in protein-coding regions were associated with preterm birth, including one previously associated with timing of fertilization, multiple SNPs spanning the first and second intron showed the strongest associations. Analysis of the associated SNPs revealed that they form both a protective (OR = 0.50, 95% CI = 0.25-0.93) as well as a risk (OR = 1.89, 95% CI = 1.08-3.39) haplotype with independent effects. In these haplotypes, two SNPs, rs12052281 and rs72822025, were predicted to disrupt ZEB1 and ELF3 transcription factor binding sites, respectively. Our results show that multiple haplotypes at FSHR are associated with preterm birth and we discuss the frequency and structure of these haplotypes outside of the Finnish population as a potential explanation for the absence of FSHR associations in some populations.
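As background for the effect sizes quoted in this abstract, the following minimal Python sketch shows one common way to compute an odds ratio and an approximate 95% confidence interval from a 2×2 carrier table. The Woolf (log-normal) interval is an assumption, since the abstract does not say how the haplotype odds ratios were estimated, and the counts in the example are hypothetical.

```python
import math


def odds_ratio_with_ci(case_carriers, case_noncarriers,
                       control_carriers, control_noncarriers, z=1.96):
    """Odds ratio and Woolf confidence interval from a 2x2 table.

    Returns (odds_ratio, lower, upper). z=1.96 gives an approximate 95% interval.
    """
    cells = (case_carriers, case_noncarriers, control_carriers, control_noncarriers)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for the Woolf interval")
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers
    )
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical carrier counts in preterm (case) and term (control) mothers.
estimate, lower, upper = odds_ratio_with_ci(20, 107, 36, 99)
```

An interval that excludes 1, as both intervals in the abstract do, indicates an association at roughly the 5% significance level under this approximation.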
Affiliation(s)
- Sung Chun
- Computational and Systems Biology Program, Washington University, St. Louis, Missouri, United States of America
- Jevon Plunkett
- Program in Human and Statistical Genetics, Washington University, St. Louis, Missouri, United States of America
- Kari Teramo
- Department of Obstetrics and Gynecology, Helsinki University Central Hospital, Helsinki, Finland
- Louis J. Muglia
- Center for Prevention of Preterm Birth, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
- Justin C. Fay
- Computational and Systems Biology Program, Washington University, St. Louis, Missouri, United States of America
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University, St. Louis, Missouri, United States of America
34
Cao CC, Li C, Huang Z, Ma X, Sun X. Identifying rare variants with optimal depth of coverage and cost-effective overlapping pool sequencing. Genet Epidemiol 2013; 37:820-30. [PMID: 24166758 DOI: 10.1002/gepi.21769] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 04/23/2013] [Revised: 09/09/2013] [Accepted: 09/27/2013] [Indexed: 01/19/2023]
Abstract
Genome-wide association studies have identified hundreds of genetic variants associated with complex diseases, although most variants identified so far explain only a small proportion of heritability, suggesting that rare variants are responsible for missing heritability. Identification of rare variants through large-scale resequencing is becoming increasingly important but remains prohibitively expensive despite the rapid decline in sequencing costs. Nevertheless, group-testing-based overlapping pool sequencing, in which pooled rather than individual samples are sequenced, greatly reduces the effort of sample preparation as well as the cost of screening for rare variants. Here, we proposed an overlapping pool sequencing scheme to screen rare variants with optimal sequencing depth and a corresponding cost model. We formulated a model to compute the optimal depth for sufficient observations of variants in pooled sequencing. Utilizing the shifted transversal design algorithm, appropriate parameters for overlapping pool sequencing could be selected to minimize cost and guarantee accuracy. Due to the mixing constraint and high depth for pooled sequencing, results showed that it was more cost-effective to divide a large population into smaller blocks which were tested using optimized strategies independently. Finally, we conducted an experiment to screen carriers of variants with a frequency of 1%. With simulated pools and publicly available human exome sequencing data, the experiment achieved 99.93% accuracy. Utilizing overlapping pool sequencing, the cost of screening carriers of variants with a frequency of 1% in 200 diploid individuals dropped to at least 66% when the target sequencing region was set to 30 Mb.
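The idea of choosing a depth that yields "sufficient observations" of a variant in a pool can be sketched with a simple binomial model. This is not the authors' depth or cost model: it assumes equal DNA contributions from every individual, a heterozygous carrier, no sequencing error, and a hypothetical detection rule of at least `k` variant reads.

```python
from math import comb


def prob_at_least_k_variant_reads(depth, pool_size, k=2, carriers=1):
    """P(at least k reads show the variant) at a given depth in one pool.

    Each heterozygous carrier contributes 1 of the 2 * pool_size chromosomes,
    so the expected variant read fraction is carriers / (2 * pool_size).
    """
    fraction = carriers / (2 * pool_size)
    # Probability of seeing fewer than k variant reads (binomial lower tail).
    miss = sum(
        comb(depth, i) * fraction**i * (1 - fraction) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - miss


def minimal_depth(pool_size, k=2, target=0.99, max_depth=100_000):
    """Smallest depth at which the detection probability reaches the target."""
    for depth in range(k, max_depth + 1):
        if prob_at_least_k_variant_reads(depth, pool_size, k) >= target:
            return depth
    raise ValueError("target not reached within max_depth")


# Hypothetical use: depth needed to see a single carrier in a pool of 10.
needed = minimal_depth(pool_size=10, k=2, target=0.99)
```

Because the variant read fraction shrinks as the pool grows, the required depth rises with pool size, which illustrates the trade-off between pool size and sequencing depth that the abstract alludes to.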
Affiliation(s)
- Chang-Chang Cao
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
35
Ferretti L, Ramos-Onsins SE, Pérez-Enciso M. Population genomics from pool sequencing. Mol Ecol 2013; 22:5561-76. [DOI: 10.1111/mec.12522] [Citation(s) in RCA: 109] [Impact Index Per Article: 9.1] [Received: 11/30/2011] [Revised: 08/03/2013] [Accepted: 09/06/2013] [Indexed: 11/30/2022]
Affiliation(s)
- Luca Ferretti
- Center for Research in Agricultural Genomics (CRAG); UAB 08193 Bellaterra Spain
- Miguel Pérez-Enciso
- Center for Research in Agricultural Genomics (CRAG); UAB 08193 Bellaterra Spain
- Department of Animal Science and Food; Faculty of Veterinary; Universitat Autonoma de Barcelona; 08193 Bellaterra Spain
- Institut Català de Recerca i Estudis Avancats (ICREA); Passeig Lluís Companys 23 08010 Barcelona Spain
36
Kosugi S, Natsume S, Yoshida K, MacLean D, Cano L, Kamoun S, Terauchi R. Coval: improving alignment quality and variant calling accuracy for next-generation sequencing data. PLoS One 2013; 8:e75402. [PMID: 24116042 PMCID: PMC3792961 DOI: 10.1371/journal.pone.0075402] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Received: 06/13/2013] [Accepted: 08/14/2013] [Indexed: 11/26/2022]
Abstract
Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in ‘targeted’ alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/.
Affiliation(s)
- Shunichi Kosugi
- Iwate Biotechnology Research Center, Kitakami, Iwate, Japan
- Kazusa DNA Research Institute, Kisarazu, Chiba, Japan
- Daniel MacLean
- The Sainsbury Laboratory, Norwich Research Park, Norwich, United Kingdom
- Liliana Cano
- The Sainsbury Laboratory, Norwich Research Park, Norwich, United Kingdom
- Sophien Kamoun
- The Sainsbury Laboratory, Norwich Research Park, Norwich, United Kingdom
- Ryohei Terauchi
- Iwate Biotechnology Research Center, Kitakami, Iwate, Japan
37
Haller G, Kapoor M, Budde J, Xuei X, Edenberg H, Nurnberger J, Kramer J, Brooks A, Tischfield J, Almasy L, Agrawal A, Bucholz K, Rice J, Saccone N, Bierut L, Goate A. Rare missense variants in CHRNB3 and CHRNA3 are associated with risk of alcohol and cocaine dependence. Hum Mol Genet 2013; 23:810-9. [PMID: 24057674 DOI: 10.1093/hmg/ddt463] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Indexed: 11/13/2022]
Abstract
Previous findings have demonstrated that variants in nicotinic receptor genes are associated with nicotine, alcohol and cocaine dependence. Because of the substantial comorbidity, it has often been unclear whether a variant is associated with multiple substances or whether the association is actually with a single substance. To investigate the possible contribution of rare variants to the development of substance dependencies other than nicotine dependence, specifically alcohol and cocaine dependence, we undertook pooled sequencing of the coding regions and flanking sequence of CHRNA5, CHRNA3, CHRNB4, CHRNA6 and CHRNB3 in 287 African American and 1028 European American individuals from the Collaborative Study of the Genetics of Alcoholism (COGA). All members of families for whom any individual was sequenced (2504 African Americans and 7318 European Americans) were then genotyped for all variants identified by sequencing. For each gene, we then tested for association using FamSKAT. For European Americans, we find increased DSM-IV cocaine dependence symptoms (FamSKAT P = 2 × 10⁻⁴) and increased DSM-IV alcohol dependence symptoms (FamSKAT P = 5 × 10⁻⁴) among carriers of missense variants in CHRNB3. Additionally, one variant (rs149775276; H329Y) shows association with both cocaine dependence symptoms (P = 7.4 × 10⁻⁵, β = 2.04) and alcohol dependence symptoms (P = 2.6 × 10⁻⁴, β = 2.04). For African Americans, we find decreased cocaine dependence symptoms among carriers of missense variants in CHRNA3 (FamSKAT P = 0.005). Replication in an independent sample supports the role of rare variants in CHRNB3 and alcohol dependence (P = 0.006). These are the first results to implicate rare variants in CHRNB3 or CHRNA3 in risk for alcohol dependence or cocaine dependence.
38
Benitez BA, Karch CM, Cai Y, Jin SC, Cooper B, Carrell D, Bertelsen S, Chibnik L, Schneider JA, Bennett DA, Fagan AM, Holtzman D, Morris JC, Goate AM, Cruchaga C. The PSEN1, p.E318G variant increases the risk of Alzheimer's disease in APOE-ε4 carriers. PLoS Genet 2013; 9:e1003685. [PMID: 23990795 PMCID: PMC3750021 DOI: 10.1371/journal.pgen.1003685] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 03/25/2013] [Accepted: 06/14/2013] [Indexed: 01/18/2023]
Abstract
The primary constituents of plaques (Aβ42/Aβ40) and neurofibrillary tangles (tau and phosphorylated forms of tau [ptau]) are the current leading diagnostic and prognostic cerebrospinal fluid (CSF) biomarkers for AD. In this study, we performed deep sequencing of APP, PSEN1, PSEN2, GRN, APOE and MAPT genes in individuals with extreme CSF Aβ42, tau, or ptau levels. One known pathogenic mutation (PSEN1 p.A426P), four high-risk variants for AD (APOE p.L46P, MAPT p.A152T, PSEN2 p.R62H and p.R71W) and nine novel variants were identified. Surprisingly, a coding variant in PSEN1, p.E318G (rs17125721-G) exhibited a significant association with high CSF tau (p = 9.2 × 10⁻⁴) and ptau (p = 1.8 × 10⁻³) levels. The association of the p.E318G variant with Aβ deposition was observed in APOE-ε4 allele carriers. Furthermore, we found that in a large case-control series (n = 5,161) individuals who are APOE-ε4 carriers and carry the p.E318G variant are at a risk of developing AD (OR = 10.7, 95% CI = 4.7-24.6) that is similar to APOE-ε4 homozygous (OR = 9.9, 95% CI = 7.2.9-13.6), and double the risk for APOE-ε4 carriers that do not carry p.E318G (OR = 3.9, 95% CI = 3.4-4.4). The p.E318G variant is present in 5.3% (n = 30) of the families from a large clinical series of LOAD families (n = 565) and exhibited a higher frequency in familial LOAD (MAF = 2.5%) than in sporadic LOAD (MAF = 1.6%) (p = 0.02). Additionally, we found that in the presence of at least one APOE-ε4 allele, p.E318G is associated with more Aβ plaques and faster cognitive decline. We demonstrate that the effect of PSEN1, p.E318G on AD susceptibility is largely dependent on an interaction with APOE-ε4 and mediated by an increased burden of Aβ deposition.
Affiliation(s)
- Bruno A. Benitez
  - Department of Psychiatry, School of Medicine, Washington University, St. Louis, Missouri, United States of America
- Celeste M. Karch
  - Department of Psychiatry, School of Medicine, Washington University, St. Louis, Missouri, United States of America
- Yefei Cai
  - Department of Psychiatry, School of Medicine, Washington University, St. Louis, Missouri, United States of America
- Sheng Chih Jin
  - Department of Psychiatry, School of Medicine, Washington University, St. Louis, Missouri, United States of America
- Breanna Cooper
  - Department of Psychiatry, School of Medicine, Washington University, St. Louis, Missouri, United States of America
- David Carrell
  - Department of Psychiatry, School of Medicine, Washington University, St. Louis, Missouri, United States of America
- Sarah Bertelsen
  - Department of Psychiatry, School of Medicine, Washington University, St. Louis, Missouri, United States of America
- Lori Chibnik
  - Program in Translational NeuroPsychiatric Genomics, Institute for the Neurosciences Department of Neurology, Brigham and Women's Hospital, Boston, Massachusetts, United States of America
  - Harvard Medical School, Boston, Massachusetts, United States of America
  - Program in Medical and Population Genetics, Broad Institute of Harvard University and M.I.T., Cambridge, Massachusetts, United States of America
- Julie A. Schneider
  - Rush Alzheimer's Disease Center and Department of Neurological Sciences, Rush University Medical Center, Chicago, Illinois, United States of America
- David A. Bennett
  - Rush Alzheimer's Disease Center and Department of Neurological Sciences, Rush University Medical Center, Chicago, Illinois, United States of America
- Anne M. Fagan
  - Department of Neurology, School of Medicine, Washington University, St. Louis, Missouri, United States of America
  - Hope Center Program on Protein Aggregation and Neurodegeneration, Washington University St. Louis, Missouri, United States of America
- David Holtzman
  - Department of Neurology, School of Medicine, Washington University, St. Louis, Missouri, United States of America
  - Hope Center Program on Protein Aggregation and Neurodegeneration, Washington University St. Louis, Missouri, United States of America
- John C. Morris
  - Department of Neurology, School of Medicine, Washington University, St. Louis, Missouri, United States of America
  - Hope Center Program on Protein Aggregation and Neurodegeneration, Washington University St. Louis, Missouri, United States of America
- Alison M. Goate
  - Department of Psychiatry, School of Medicine, Washington University, St. Louis, Missouri, United States of America
  - Department of Neurology, School of Medicine, Washington University, St. Louis, Missouri, United States of America
  - Hope Center Program on Protein Aggregation and Neurodegeneration, Washington University St. Louis, Missouri, United States of America
  - Department of Genetics, School of Medicine, Washington University, St. Louis, Missouri, United States of America
- Carlos Cruchaga
  - Department of Psychiatry, School of Medicine, Washington University, St. Louis, Missouri, United States of America
  - Hope Center Program on Protein Aggregation and Neurodegeneration, Washington University St. Louis, Missouri, United States of America
|
39
|
He Z, Li X, Ling S, Fu YX, Hungate E, Shi S, Wu CI. Estimating DNA polymorphism from next generation sequencing data with high error rate by dual sequencing applications. BMC Genomics 2013; 14:535. [PMID: 23919637 PMCID: PMC3750404 DOI: 10.1186/1471-2164-14-535] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 01/18/2013] [Accepted: 08/03/2013] [Indexed: 11/10/2022]
Abstract
Background: As the error rate is high and the distribution of errors across sites is non-uniform in next generation sequencing (NGS) data, it has been a challenge to estimate DNA polymorphism (θ) accurately from NGS data. Results: By computer simulations, we compare the two methods of data acquisition: sequencing each diploid individual separately and sequencing the pooled sample. Under the current NGS error rate, sequencing each individual separately offers little advantage unless the coverage per individual is high (>20X). We hence propose a new method for estimating θ from pooled samples that have been subjected to two separate rounds of DNA sequencing. Since errors from the two sequencing applications are usually non-overlapping, it is possible to separate low frequency polymorphisms from sequencing errors. Simulation results show that the dual applications method is reliable even when the error rate is high and θ is low. Conclusions: In studies of natural populations where the sequencing coverage is usually modest (~2X per individual), the dual applications method on pooled samples should be a reasonable choice.
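The core intuition of the dual applications idea, that a true low-frequency variant should reappear in an independent sequencing round while a random error usually will not, can be sketched in a few lines. The code below is a deliberately simplified illustration and not the estimator proposed in the paper: it keeps only sites where the same non-reference allele is seen in both rounds, then plugs the count into the textbook Watterson estimator. The function names, the input layout (site to allele to read count), and the `min_reads` threshold are all assumptions made for the example.

```python
def shared_variant_sites(run_a, run_b, min_reads=1):
    """Sites where the same non-reference allele appears in both runs.

    run_a and run_b map site -> {alt_allele: read_count}. A site is
    kept only if at least one alternate allele reaches min_reads in
    each run (hypothetical filter, chosen for illustration).
    """
    shared = set()
    for site, alleles_a in run_a.items():
        alleles_b = run_b.get(site, {})
        for allele, count in alleles_a.items():
            if count >= min_reads and alleles_b.get(allele, 0) >= min_reads:
                shared.add(site)
                break
    return shared


def watterson_theta(num_segregating, num_chromosomes, num_sites):
    """Per-site Watterson estimate: S / (a_n * L), a_n = sum_{i<n} 1/i."""
    if num_chromosomes < 2 or num_sites <= 0:
        raise ValueError("need at least 2 chromosomes and 1 site")
    a_n = sum(1.0 / i for i in range(1, num_chromosomes))
    return num_segregating / (a_n * num_sites)
```

For instance, if round A reports alternate alleles at sites 10, 25 and 40 and round B confirms only the allele at site 10, a single segregating site survives the filter. Treating every pooled chromosome as sampled is itself an approximation at low coverage, which is one reason a dedicated estimator is needed in practice.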
Affiliation(s)
- Ziwen He
- State Key Laboratory of Biocontrol and Guangdong Key Laboratory of Plant Resources, Sun Yat-sen University, 135 Xingang West Road, Guangzhou 510275, China
|
40
|
Diaw L, Youngblood V, Taylor JG. Introduction to next-generation nucleic acid sequencing in cardiovascular disease research. Methods Mol Biol 2013; 1027:157-79. [PMID: 23912986 DOI: 10.1007/978-1-60327-369-5_7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 02/16/2023]
Abstract
The identification of new genomic paradigms in lipoprotein and cardiovascular diseases will be accelerated by the application of the recent technological advances in nucleic acid sequencing. Presently, large-scale genomics facilities are equipped to accomplish this objective with a combination of "next-generation" DNA sequencing chemistries, largely focused on assembling massively parallel sequence reads corresponding to complete genes, entire exomes, or whole genomes from populations of individuals. In the future, individual laboratories will also use this emerging technology for focused genomic studies with the use of a combination of next-generation sequencing and automated Sanger sequencing. In particular, next-generation sequencing will play an increasingly important role when applied to chromatin immunoprecipitation, RNA transcriptome analysis, and studies of human genetic variation and mutation in carefully phenotyped healthy and disease populations. In this chapter, a brief overview of recent technological advances in next-generation nucleic acid sequencing is presented, with emphasis on practical application to clinical studies in cardiovascular diseases.
Affiliation(s)
- Lena Diaw
- Pulmonary and Vascular Medicine Branch, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
|
41
|
Tan MK, Koval J, Ghalayini A. Novel genetic variants of GA-insensitive Rht-1 genes in hexaploid wheat and their potential agronomic value. PLoS One 2013; 8:e69690. [PMID: 23894524 PMCID: PMC3716649 DOI: 10.1371/journal.pone.0069690] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 12/06/2012] [Accepted: 06/13/2013] [Indexed: 01/01/2023]
Abstract
This study has found numerous novel genetic variants of GA-insensitive dwarfing genes with potential agricultural value for crop improvement. The cultivar Spica is a tall genotype and possesses the wild-type genes Rht-A1a, Rht-B1a and Rht-D1a. The cultivar Quarrion possesses a null mutant in the DELLA motif in each of the 3 genomes. This is the first report of a null mutant of Rht-A1. In addition, novel null mutants which differ from the reported null alleles Rht-B1b, Rht-B1e and Rht-D1b have been found in Quarrion, Carnamah and Whistler. The accession Aus1408 has an allele of Rht-B1 with a mutation in the conserved ‘TVHYNP’ N-terminal signal binding domain, with possible implications for its sensitivity to GA. Mutations in the conserved C-terminal GRAS domain of Rht-A1 alleles with possible effects on expression have been found in WW1842, Quarrion and Drysdale. Genetic variants with putative spliceosomal introns in the GRAS domain have been found in all accessions except Spica. Genome-specific cis-sequences about 124 bp upstream of the start codon of the Rht-1 gene have been identified for each of the three genomes.
Affiliation(s)
- Mui-Keng Tan
- Elizabeth Macarthur Agricultural Institute, New South Wales (NSW) Department of Primary Industries, Menangle, New South Wales, Australia.
|
42
|
Fernandez-Mercado M, Burns A, Pellagatti A, Giagounidis A, Germing U, Agirre X, Prosper F, Aul C, Killick S, Wainscoat JS, Schuh A, Boultwood J. Targeted re-sequencing analysis of 25 genes commonly mutated in myeloid disorders in del(5q) myelodysplastic syndromes. Haematologica 2013; 98:1856-64. [PMID: 23831921 DOI: 10.3324/haematol.2013.086686] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 12/14/2022]
Abstract
Interstitial deletion of chromosome 5q is the most common chromosomal abnormality in myelodysplastic syndromes. The catalogue of genes involved in the molecular pathogenesis of myelodysplastic syndromes is rapidly expanding and next-generation sequencing technology allows detection of these mutations at great depth. Here we describe the design, validation and application of a targeted next-generation sequencing approach to simultaneously screen 25 genes mutated in myeloid malignancies. We used this method alongside single nucleotide polymorphism-array technology to characterize the mutational and cytogenetic profile of 43 cases of early or advanced del(5q) myelodysplastic syndromes. A total of 29 mutations were detected in our cohort. Overall, 45% of early and 66.7% of advanced cases had at least one mutation. Genes with the highest mutation frequency among advanced cases were TP53 and ASXL1 (25% of patients each). These showed a lower mutation frequency in cases of 5q- syndrome (4.5% and 13.6%, respectively), suggesting a role in disease progression in del(5q) myelodysplastic syndromes. Fifty-two percent of mutations identified were in genes involved in epigenetic regulation (ASXL1, TET2, DNMT3A and JAK2). Six mutations had allele frequencies <20%, likely below the detection limit of traditional sequencing methods. Genomic array data showed that cases of advanced del(5q) myelodysplastic syndrome had a complex background of cytogenetic aberrations, often encompassing genes involved in myeloid disorders. Our study is the first to investigate the molecular pathogenesis of early and advanced del(5q) myelodysplastic syndromes using next-generation sequencing technology on a large panel of genes frequently mutated in myeloid malignancies, further illuminating the molecular landscape of del(5q) myelodysplastic syndromes.
|
43
|
Osborne AJ, Zavodna M, Chilvers BL, Robertson BC, Negro SS, Kennedy MA, Gemmell NJ. Extensive variation at MHC DRB in the New Zealand sea lion (Phocarctos hookeri) provides evidence for balancing selection. Heredity (Edinb) 2013; 111:44-56. [PMID: 23572124 PMCID: PMC3692317 DOI: 10.1038/hdy.2013.18] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 07/22/2012] [Revised: 12/20/2012] [Accepted: 01/28/2013] [Indexed: 11/09/2022]
Abstract
Marine mammals are often reported to possess reduced variation of major histocompatibility complex (MHC) genes compared with their terrestrial counterparts. We evaluated diversity at two MHC class II B genes, DQB and DRB, in the New Zealand sea lion (Phocarctos hookeri, NZSL) a species that has suffered high mortality owing to bacterial epizootics, using Sanger sequencing and haplotype reconstruction, together with next-generation sequencing. Despite this species' prolonged history of small population size and highly restricted distribution, we demonstrate extensive diversity at MHC DRB with 26 alleles, whereas MHC DQB is dimorphic. We identify four DRB codons, predicted to be involved in antigen binding, that are evolving under adaptive evolution. Our data suggest diversity at DRB may be maintained by balancing selection, consistent with the role of this locus as an antigen-binding region and the species' recent history of mass mortality during a series of bacterial epizootics. Phylogenetic analyses of DQB and DRB sequences from pinnipeds and other carnivores revealed significant allelic diversity, but little phylogenetic depth or structure among pinniped alleles; thus, we could neither confirm nor refute the possibility of trans-species polymorphism in this group. The phylogenetic pattern observed however, suggests some significant evolutionary constraint on these loci in the recent past, with the pattern consistent with that expected following an epizootic event. These data may help further elucidate some of the genetic factors underlying the unusually high susceptibility to bacterial infection of the threatened NZSL, and help us to better understand the extent and pattern of MHC diversity in pinnipeds.
Affiliation(s)
- A J Osborne
- Centre for Reproduction and Genomics, Department of Anatomy, University of Otago, Dunedin, New Zealand.
|
44
|
McRae AF, Richter MM, Lind PA. Case-control association testing of common variants from sequencing of DNA pools. PLoS One 2013; 8:e65410. [PMID: 23762362 PMCID: PMC3676437 DOI: 10.1371/journal.pone.0065410] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2013] [Accepted: 04/25/2013] [Indexed: 12/05/2022]
Abstract
While genome-wide association studies (GWAS) have been successful in identifying a large number of variants associated with disease, the challenge of locating the underlying causal loci remains. Sequencing of case and control DNA pools provides an inexpensive method for assessing all variation in a genomic region surrounding a significant GWAS result. However, individual variants need to be ranked in terms of the strength of their association to disease in order to prioritise follow-up by individual genotyping. A simple method for testing for case-control association in sequence data from DNA pools is presented that allows the partitioning of the variance in allele frequency estimates into components due to the sampling of chromosomes from the pool during sequencing, sampling individuals from the population and unequal contribution from individuals during pool construction. The utility of this method is demonstrated on a sequence from the alcohol dehydrogenase (ADH) gene cluster on a case-control sample for heavy alcohol consumption.
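To make the variance partition described in this abstract concrete, the sketch below tests for a difference in allele frequency between a case pool and a control pool using a two-stage sampling variance: one term for drawing 2n chromosomes from the population and one for drawing reads from the pool. It is a minimal illustration under stated assumptions, not the authors' method: it assumes every individual contributes equally to the pool (so the third variance component named in the abstract is ignored), and the function names and inputs are hypothetical.

```python
import math


def pool_freq_variance(p, n_individuals, depth):
    """Variance of a pooled allele-frequency estimate.

    Two-stage binomial sampling: 2n chromosomes drawn from the
    population, then `depth` reads drawn from the pool. Assumes
    equal contribution of every individual to the pool.
    """
    chromosomes = 2 * n_individuals
    return p * (1 - p) * (1 / chromosomes
                          + (1 - 1 / chromosomes) / depth)


def pooled_case_control_z(alt_case, depth_case, n_case,
                          alt_ctrl, depth_ctrl, n_ctrl):
    """Two-sided z-test for a frequency difference between two pools."""
    p_case = alt_case / depth_case
    p_ctrl = alt_ctrl / depth_ctrl
    # Frequency under the null of no difference between pools.
    p_null = (alt_case + alt_ctrl) / (depth_case + depth_ctrl)
    variance = (pool_freq_variance(p_null, n_case, depth_case)
                + pool_freq_variance(p_null, n_ctrl, depth_ctrl))
    if variance == 0:
        return 0.0, 1.0
    z = (p_case - p_ctrl) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

Ranking variants in a region by the resulting p-value is one simple way to prioritise them for individual genotyping. Unequal pooling would inflate the variance further, so p-values from this sketch should be read as optimistic.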
Affiliation(s)
- Allan F McRae
- University of Queensland Diamantina Institute, Brisbane, Australia.
|
45
|
Gautier M, Foucaud J, Gharbi K, Cézard T, Galan M, Loiseau A, Thomson M, Pudlo P, Kerdelhué C, Estoup A. Estimation of population allele frequencies from next-generation sequencing data: pool-versus individual-based genotyping. Mol Ecol 2013; 22:3766-79. [PMID: 23730833 DOI: 10.1111/mec.12360] [Citation(s) in RCA: 145] [Impact Index Per Article: 12.1] [Received: 03/14/2013] [Revised: 04/15/2013] [Accepted: 04/16/2013] [Indexed: 12/16/2022]
Abstract
Molecular markers produced by next-generation sequencing (NGS) technologies are revolutionizing genetic research. However, the costs of analysing large numbers of individual genomes remain prohibitive for most population genetics studies. Here, we present results based on mathematical derivations showing that, under many realistic experimental designs, NGS of DNA pools from diploid individuals allows the estimation of allele frequencies at single nucleotide polymorphisms (SNPs) with at least the same accuracy as individual-based analyses, for considerably lower library construction and sequencing efforts. These findings remain true when taking into account the possibility of substantially unequal contributions of each individual to the final pool of sequence reads. We propose the intuitive notion of effective pool size to account for unequal pooling and derive a Bayesian hierarchical model to estimate this parameter directly from the data. We provide a user-friendly application assessing the accuracy of allele frequency estimation from both pool- and individual-based NGS population data under various sampling, sequencing depth and experimental error designs. We illustrate our findings with theoretical examples and real data sets corresponding to SNP loci obtained using restriction site-associated DNA (RAD) sequencing in pool- and individual-based experiments carried out on the same population of the pine processionary moth (Thaumetopoea pityocampa). NGS of DNA pools might not be optimal for all types of studies but provides a cost-effective approach for estimating allele frequencies for very large numbers of SNPs. It thus allows comparison of genome-wide patterns of genetic variation for large numbers of individuals in multiple populations.
Affiliation(s)
- Mathieu Gautier
- INRA, UMR CBGP (INRA-IRD-Cirad-Montpellier SupAgro), Campus international de Baillarguet, CS 30016, F-34988, Montferrier-sur-Lez, France.
|
46
|
Zavodna M, Grueber CE, Gemmell NJ. Parallel tagged next-generation sequencing on pooled samples - a new approach for population genetics in ecology and conservation. PLoS One 2013; 8:e61471. [PMID: 23637841 PMCID: PMC3630221 DOI: 10.1371/journal.pone.0061471] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 10/31/2012] [Accepted: 03/08/2013] [Indexed: 12/02/2022]
Abstract
Next-generation sequencing (NGS) on pooled samples has already been broadly applied in human medical diagnostics and plant and animal breeding. However, thus far it has been only sparingly employed in ecology and conservation, where it may serve as a useful diagnostic tool for rapid assessment of species genetic diversity and structure at the population level. Here we undertake a comprehensive evaluation of the accuracy, practicality and limitations of parallel tagged amplicon NGS on pooled population samples for estimating species population diversity and structure. We obtained 16S and Cyt b data from 20 populations of Leiopelma hochstetteri, a frog species of conservation concern in New Zealand, using two approaches - parallel tagged NGS on pooled population samples and individual Sanger sequenced samples. Data from each approach were then used to estimate two standard population genetic parameters, nucleotide diversity (π) and population differentiation (FST), that enable population genetic inference in a species conservation context. We found a positive correlation between our two approaches for population genetic estimates, showing that the pooled population NGS approach is a reliable, rapid and appropriate method for population genetic inference in an ecological and conservation context. Our experimental design also allowed us to identify both the strengths and weaknesses of the pooled population NGS approach and outline some guidelines and suggestions that might be considered when planning future projects.
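The two parameters named in this abstract, nucleotide diversity (π) and population differentiation (FST), have simple definitions when individual aligned sequences are available, as in the Sanger arm of such a comparison. The sketch below computes π as the mean pairwise difference per site and FST with one common diversity-based formulation, 1 minus within-population diversity over between-population diversity. It is an assumption-laden illustration: the function names are hypothetical, sequences are assumed to be aligned and of equal length, and the paper may have used a different FST estimator, particularly for the pooled read data.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site within one population."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    length = len(seqs[0])
    return sum(hamming(a, b) for a, b in pairs) / (len(pairs) * length)


def between_diversity(pop1, pop2):
    """Mean pairwise differences per site between two populations."""
    pairs = list(product(pop1, pop2))
    length = len(pop1[0])
    return sum(hamming(a, b) for a, b in pairs) / (len(pairs) * length)


def fst(pop1, pop2):
    """1 - (mean within-population pi) / (between-population pi)."""
    within = (nucleotide_diversity(pop1) + nucleotide_diversity(pop2)) / 2
    between = between_diversity(pop1, pop2)
    return 0.0 if between == 0 else 1 - within / between
```

Two populations fixed for different haplotypes give an FST of 1 under this definition, while very small samples can yield negative values, which are usually interpreted as no detectable differentiation.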
Affiliation(s)
- Monika Zavodna
  - Centre for Reproduction and Genomics, Department of Anatomy, University of Otago, Dunedin, New Zealand
- Catherine E. Grueber
  - Centre for Reproduction and Genomics, Department of Anatomy, University of Otago, Dunedin, New Zealand
  - Department of Zoology, University of Otago, Dunedin, New Zealand
  - Allan Wilson Centre for Molecular Ecology and Evolution, University of Otago, Dunedin, New Zealand
- Neil J. Gemmell
  - Centre for Reproduction and Genomics, Department of Anatomy, University of Otago, Dunedin, New Zealand
  - Allan Wilson Centre for Molecular Ecology and Evolution, University of Otago, Dunedin, New Zealand
|
47
|
Abstract
Advances in DNA sequencing provide tools for efficient large-scale discovery of markers for use in plants. Discovery options include large-scale amplicon sequencing, transcriptome sequencing, gene-enriched genome sequencing and whole genome sequencing. Examples of each of these approaches and their potential to generate molecular markers for specific applications have been described. Sequencing the whole genome of parents identifies all the polymorphisms available for analysis in their progeny. Sequencing PCR amplicons of sets of candidate genes from DNA bulks can be used to define the available variation in these genes that might be exploited in a population or germplasm collection. Sequencing of the transcriptomes of genotypes varying for the trait of interest may identify genes with patterns of expression that could explain the phenotypic variation. Sequencing genomic DNA enriched for genes by hybridization with probes for all or some of the known genes simplifies sequencing and analysis of differences in gene sequences between large numbers of genotypes and genes especially when working with complex genomes. Examples of application of the above-mentioned techniques have been described.
|
48
|
|
49
|
Abstract
Background: Structural variations in human genomes, such as deletions, play an important role in cancer development. Next-generation sequencing technologies have been central in providing ways to detect such variations. Methods like paired-end mapping allow data from several samples to be analyzed simultaneously in order to, e.g., distinguish tumor-specific from patient-specific variations. However, it has been shown that, especially in this setting, there is a need to explicitly take overlapping deletions into consideration. Existing tools have only minor capabilities to call overlapping deletions and are unable to unravel complex signals to obtain consistent predictions. Result: We present a first approach specifically designed to cluster short-read paired-end data into possibly overlapping deletion predictions. The method does not make any assumptions on the composition of the data, such as the number of samples, heterogeneity, polyploidy, etc. Taking paired ends mapped to a reference genome as input, it iteratively merges mappings into clusters based on a similarity score that takes both the putative location and size of a deletion into account. Conclusion: We demonstrate that agglomerative clustering is suitable to predict deletions. Analyzing real data from three samples of a cancer patient, we found putatively overlapping deletions and observed that, as a side-effect, erroneous mappings are mostly identified as singleton clusters. An evaluation on simulated data shows, compared to other methods which can output overlapping clusters, high accuracy in separating overlapping from single deletions.
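The iterative merging described in this abstract follows the general pattern of agglomerative clustering, which can be sketched as follows. This is not the tool's algorithm or its similarity score: as a stand-in, the example represents each paired-end mapping by the putative deletion interval it implies and uses the Jaccard overlap of intervals, which is sensitive to both location and size. The function names, the centroid-based cluster representative, and the `min_similarity` threshold are all assumptions made for illustration.

```python
def interval_jaccard(a, b):
    """Overlap length divided by union length for two (start, end) spans."""
    intersection = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


def centroid(cluster):
    """Mean start and mean end of the intervals in a cluster."""
    n = len(cluster)
    return (sum(start for start, _ in cluster) / n,
            sum(end for _, end in cluster) / n)


def cluster_deletions(intervals, min_similarity=0.5):
    """Greedy agglomerative clustering of putative deletion intervals.

    Repeatedly merges the two most similar clusters until no pair
    reaches min_similarity (hypothetical stopping rule).
    """
    clusters = [[interval] for interval in intervals]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = interval_jaccard(centroid(clusters[i]),
                                       centroid(clusters[j]))
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        if best[0] < min_similarity:
            break
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

In this toy version, a small interval nested inside a larger one scores low and therefore stays in its own cluster, which loosely mirrors the abstract's point that overlapping deletions should be kept apart and that stray mappings tend to end up as singletons.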
Affiliation(s)
- Roland Wittler
- Genome Informatics, Faculty of Technology and Institute for Bioinformatics, Center for Biotechnology, Bielefeld University, 33594 Bielefeld, Germany.
|
50
|
Hiatt JB, Pritchard CC, Salipante SJ, O'Roak BJ, Shendure J. Single molecule molecular inversion probes for targeted, high-accuracy detection of low-frequency variation. Genome Res 2013; 23:843-54. [PMID: 23382536 PMCID: PMC3638140 DOI: 10.1101/gr.147686.112] [Citation(s) in RCA: 256] [Impact Index Per Article: 21.3] [Indexed: 01/06/2023]
Abstract
The detection and quantification of genetic heterogeneity in populations of cells is fundamentally important to diverse fields, ranging from microbial evolution to human cancer genetics. However, despite the cost and throughput advances associated with massively parallel sequencing, it remains challenging to reliably detect mutations that are present at a low relative abundance in a given DNA sample. Here we describe smMIP, an assay that combines single molecule tagging with multiplex targeted capture to enable practical and highly sensitive detection of low-frequency or subclonal variation. To demonstrate the potential of the method, we simultaneously resequenced 33 clinically informative cancer genes in eight cell line and 45 clinical cancer samples. Single molecule tagging facilitated extremely accurate consensus calling, with an estimated per-base error rate of 8.4 × 10⁻⁶ in cell lines and 2.6 × 10⁻⁵ in clinical specimens. False-positive mutations in the single molecule consensus base-calls exhibited patterns predominantly consistent with DNA damage, including 8-oxo-guanine and spontaneous deamination of cytosine. Based on mixing experiments with cell line samples, sensitivity for mutations above 1% frequency was 83% with no false positives. At clinically informative sites, we identified seven low-frequency point mutations (0.2%-4.7%), including BRAF p.V600E (melanoma, 0.2% alternate allele frequency), KRAS p.G12V (lung, 0.6%), JAK2 p.V617F (melanoma, colon, two lung, 0.3%-1.4%), and NRAS p.Q61R (colon, 4.7%). We anticipate that smMIP will be broadly adoptable as a practical and effective method for accurately detecting low-frequency mutations in both research and clinical settings.
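The reason single molecule tagging lowers the error rate can be illustrated with a small consensus-calling sketch: reads that carry the same molecular tag derive from the same original molecule, so a base that disagrees with the majority within a tag group is most likely a sequencing or amplification error. The code below is a simplified illustration and not the smMIP pipeline: the function name, the `(tag, sequence)` input layout, the plain majority vote, and the `min_reads` cut-off are assumptions made for the example.

```python
from collections import Counter, defaultdict


def consensus_by_tag(reads, min_reads=2):
    """Majority-vote consensus sequence for each molecular tag.

    reads is an iterable of (tag, sequence) pairs with sequences
    already aligned to the same start. Tag groups with fewer than
    min_reads reads are dropped, since a single read cannot be
    error-corrected (hypothetical threshold).
    """
    groups = defaultdict(list)
    for tag, sequence in reads:
        groups[tag].append(sequence)

    consensus = {}
    for tag, sequences in groups.items():
        if len(sequences) < min_reads:
            continue
        length = min(len(s) for s in sequences)
        # Pick the most frequent base at each position.
        consensus[tag] = "".join(
            Counter(s[i] for s in sequences).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus
```

Given three reads tagged `AAC` where one carries an isolated mismatch, the consensus keeps the base supported by the other two. Errors that arise before tagging, such as the DNA damage patterns noted in the abstract, are shared by every read in the group and survive this kind of voting.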
Affiliation(s)
- Joseph B Hiatt
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA.
|