751
Cheng DT, Cheng J, Mitchell TN, Syed A, Zehir A, Mensah NYT, Oultache A, Nafa K, Levine RL, Arcila ME, Berger MF, Hedvat CV. Detection of mutations in myeloid malignancies through paired-sample analysis of microdroplet-PCR deep sequencing data. J Mol Diagn 2014; 16:504-518. [PMID: 25017477 DOI: 10.1016/j.jmoldx.2014.05.006] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 10/26/2013] [Revised: 04/24/2014] [Accepted: 05/09/2014] [Indexed: 01/10/2023]
Abstract
Amplicon-based methods for targeted resequencing of cancer genes have gained traction in the clinic as a strategy for molecular diagnostic testing. An 847-amplicon panel was designed with the RainDance DeepSeq system, covering most exons of 28 genes relevant to acute myeloid leukemia and myeloproliferative neoplasms. We developed a paired-sample analysis pipeline for variant calling and sought to assess its sensitivity and specificity relative to a set of samples with previously identified mutations. Thirty samples with known mutations in JAK2, NPM1, DNMT3A, MPL, IDH1, IDH2, CEBPA, and FLT3 were profiled and sequenced to high depth. Variant calling using an unmatched HapMap DNA control removed a substantial number of artifactual calls regardless of algorithm used or variant class. The removed calls were nonunique, had lower variant frequencies, and tended to recur in multiple unrelated samples. Analysis of sample replicates revealed that reproducible calls had distinctly higher variant allele depths and frequencies compared to nonreproducible calls. On the basis of these differences, filters on variant frequency were chosen to select for reproducible calls. The analysis pipeline successfully retrieved the associated known variant in all tested samples and uncovered additional mutations in some samples corresponding to well-characterized hotspot mutations in acute myeloid leukemia. We have developed a paired-sample analysis pipeline capable of robust identification of mutations from microdroplet-PCR sequencing data with high sensitivity and specificity.
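The paired-sample filtering idea in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Call` record, the `paired_filter` function, and both thresholds (`min_vaf`, `min_alt_depth`) are hypothetical, and the depth filter is an assumption suggested only by the remark that reproducible calls had higher variant allele depths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """One variant call (hypothetical record layout for illustration)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    alt_depth: int    # reads supporting the variant allele
    total_depth: int  # all reads covering the position

    @property
    def key(self):
        # Identity of the variant, independent of depth.
        return (self.chrom, self.pos, self.ref, self.alt)

    @property
    def vaf(self):
        # Variant allele frequency; 0.0 when the site is uncovered.
        return self.alt_depth / self.total_depth if self.total_depth else 0.0


def paired_filter(sample_calls, control_calls, min_vaf=0.05, min_alt_depth=10):
    """Keep sample calls that are absent from the control and pass the filters.

    The default thresholds are placeholders, not values from the paper.
    """
    control_keys = {c.key for c in control_calls}
    return [
        c for c in sample_calls
        if c.key not in control_keys
        and c.vaf >= min_vaf
        and c.alt_depth >= min_alt_depth
    ]
```

Calls that also appear in the unmatched control are treated as artifacts, and the frequency threshold then selects for the calls most likely to be reproducible.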
Affiliation(s)
- Donavan T Cheng
  - Molecular Diagnostics Service, Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, New York
- Janice Cheng
  - Molecular Diagnostics Service, Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, New York
- Talia N Mitchell
  - Molecular Diagnostics Service, Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, New York
- Aijazuddin Syed
  - Molecular Diagnostics Service, Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, New York
- Ahmet Zehir
  - Molecular Diagnostics Service, Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, New York
- Nana Yaa T Mensah
  - Molecular Diagnostics Service, Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, New York
- Alifya Oultache
  - Molecular Diagnostics Service, Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, New York
- Khedoudja Nafa
  - Molecular Diagnostics Service, Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, New York
- Ross L Levine
  - Human Oncology and Pathogenesis Program, Memorial Sloan-Kettering Cancer Center, New York, New York
- Maria E Arcila
  - Molecular Diagnostics Service, Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, New York
- Michael F Berger
  - Molecular Diagnostics Service, Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, New York; Human Oncology and Pathogenesis Program, Memorial Sloan-Kettering Cancer Center, New York, New York
- Cyrus V Hedvat
  - Molecular Diagnostics Service, Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, New York
752
Zare H, Wang J, Hu A, Weber K, Smith J, Nickerson D, Song C, Witten D, Blau CA, Noble WS. Inferring clonal composition from multiple sections of a breast cancer. PLoS Comput Biol 2014; 10:e1003703. [PMID: 25010360 PMCID: PMC4091710 DOI: 10.1371/journal.pcbi.1003703] [Citation(s) in RCA: 90] [Impact Index Per Article: 8.2] [Received: 11/08/2013] [Accepted: 05/20/2014] [Indexed: 12/13/2022]
Abstract
Cancers arise from successive rounds of mutation and selection, generating clonal populations that vary in size, mutational content and drug responsiveness. Ascertaining the clonal composition of a tumor is therefore important both for prognosis and therapy. Mutation counts and frequencies resulting from next-generation sequencing (NGS) potentially reflect a tumor's clonal composition; however, deconvolving NGS data to infer a tumor's clonal structure presents a major challenge. We propose a generative model for NGS data derived from multiple subsections of a single tumor, and we describe an expectation-maximization procedure for estimating the clonal genotypes and relative frequencies using this model. We demonstrate, via simulation, the validity of the approach, and then use our algorithm to assess the clonal composition of a primary breast cancer and associated metastatic lymph node. After dividing the tumor into subsections, we perform exome sequencing for each subsection to assess mutational content, followed by deep sequencing to precisely count normal and variant alleles within each subsection. By quantifying the frequencies of 17 somatic variants, we demonstrate that our algorithm predicts clonal relationships that are both phylogenetically and spatially plausible. Applying this method to larger numbers of tumors should cast light on the clonal evolution of cancers in space and time.
Cancers arise from a series of mutations that occur over time. As a result, as a tumor grows each cell inherits a distinctive genotype, defined by the set of all somatic mutations that distinguish the tumor cell from normal cells. Ascertaining these genotype patterns, and identifying which ones are associated with the growth of the cancer and its ability to metastasize, can potentially give clinicians insights into how to treat the cancer. In this work, we describe a method for inferring the predominant genotypes within a single tumor. The method requires that a tumor be sectioned and that each section be subjected to a high-throughput sequencing procedure. The resulting mutations and their associated frequencies within each tumor section are then used as input to a probabilistic model that infers the underlying genotypes and their relative frequencies within the tumor. We use simulated data to demonstrate the validity of the approach, and then we apply our algorithm to data from a primary breast cancer and associated metastatic lymph node. We demonstrate that our algorithm predicts genotypes that are consistent with an evolutionary model and with the physical topology of the tumor itself. Applying this method to larger numbers of tumors should cast light on the evolution of cancers in space and time.
Affiliation(s)
- Habil Zare
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Junfeng Wang
  - Division of Hematology, Department of Medicine, University of Washington, Seattle, Washington, United States of America
- Alex Hu
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Kris Weber
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Josh Smith
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Debbie Nickerson
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- ChaoZhong Song
  - Division of Hematology, Department of Medicine, University of Washington, Seattle, Washington, United States of America
- Daniela Witten
  - Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
  - * E-mail: (DW); (CAB); (WSN)
- C. Anthony Blau
  - Division of Hematology, Department of Medicine, University of Washington, Seattle, Washington, United States of America
  - * E-mail: (DW); (CAB); (WSN)
- William Stafford Noble
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
  - Department of Computer Science and Engineering, University of Washington, Seattle, Washington, United States of America
  - * E-mail: (DW); (CAB); (WSN)
753
Abstract
High-throughput DNA sequencing has revolutionized the study of cancer genomics with numerous discoveries that are relevant to cancer diagnosis and treatment. The latest sequencing and analysis methods have successfully identified somatic alterations, including single-nucleotide variants, insertions and deletions, copy-number aberrations, structural variants and gene fusions. Additional computational techniques have proved useful for defining the mutations, genes and molecular networks that drive diverse cancer phenotypes and that determine clonal architectures in tumour samples. Collectively, these tools have advanced the study of genomic, transcriptomic and epigenomic alterations in cancer, and their association to clinical properties. Here, we review cancer genomics software and the insights that have been gained from their application.
754
Mapping-by-sequencing identifies HvPHYTOCHROME C as a candidate gene for the early maturity 5 locus modulating the circadian clock and photoperiodic flowering in barley. Genetics 2014; 198:383-96. [PMID: 24996910 PMCID: PMC4174949 DOI: 10.1534/genetics.114.165613] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Indexed: 12/25/2022]
Abstract
Phytochromes play an important role in light signaling and photoperiodic control of flowering time in plants. Here we propose that the red/far-red light photoreceptor HvPHYTOCHROME C (HvPHYC), carrying a mutation in a conserved region of the GAF domain, is a candidate underlying the early maturity 5 locus in barley (Hordeum vulgare L.). We fine mapped the gene using a mapping-by-sequencing approach applied on the whole-exome capture data from bulked early flowering segregants derived from a backcross of the Bowman(eam5) introgression line. We demonstrate that eam5 disrupts circadian expression of clock genes. Moreover, it interacts with the major photoperiod response gene Ppd-H1 to accelerate flowering under noninductive short days. Our results suggest that HvPHYC participates in transmission of light signals to the circadian clock and thus modulates light-dependent processes such as photoperiodic regulation of flowering.
755
Boczonadi V, Müller JS, Pyle A, Munkley J, Dor T, Quartararo J, Ferrero I, Karcagi V, Giunta M, Polvikoski T, Birchall D, Princzinger A, Cinnamon Y, Lützkendorf S, Piko H, Reza M, Florez L, Santibanez-Koref M, Griffin H, Schuelke M, Elpeleg O, Kalaydjieva L, Lochmüller H, Elliott DJ, Chinnery PF, Edvardson S, Horvath R. EXOSC8 mutations alter mRNA metabolism and cause hypomyelination with spinal muscular atrophy and cerebellar hypoplasia. Nat Commun 2014; 5:4287. [PMID: 24989451 PMCID: PMC4102769 DOI: 10.1038/ncomms5287] [Citation(s) in RCA: 112] [Impact Index Per Article: 10.2] [Received: 02/12/2014] [Accepted: 06/03/2014] [Indexed: 12/21/2022]
Abstract
The exosome is a multi-protein complex, required for the degradation of AU-rich element (ARE) containing messenger RNAs (mRNAs). EXOSC8 is an essential protein of the exosome core, as its depletion causes a severe growth defect in yeast. Here we show that homozygous missense mutations in EXOSC8 cause progressive and lethal neurological disease in 22 infants from three independent pedigrees. Affected individuals have cerebellar and corpus callosum hypoplasia, abnormal myelination of the central nervous system or spinal motor neuron disease. Experimental downregulation of EXOSC8 in human oligodendroglia cells and in zebrafish induces a specific increase in ARE mRNAs encoding myelin proteins, showing that the imbalanced supply of myelin proteins causes the disruption of myelin, and explaining the clinical presentation. These findings show the central role of the exosomal pathway in neurodegenerative disease.
The exosome is responsible for mRNA degradation, which is an important step in the regulation of gene expression. Here the authors report that homozygous missense mutations in the exosome subunit, EXOSC8, may cause neurodegenerative disease in infants through the dysregulation of myelin expression.
Affiliation(s)
- Veronika Boczonadi
  - [1] Institute of Genetic Medicine, Wellcome Trust Centre for Mitochondrial Research, Newcastle University, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK [2]
- Juliane S Müller
  - [1] Institute of Genetic Medicine, Wellcome Trust Centre for Mitochondrial Research, Newcastle University, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK [2]
- Angela Pyle
  - [1] Institute of Genetic Medicine, Wellcome Trust Centre for Mitochondrial Research, Newcastle University, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK [2]
- Jennifer Munkley
  - [1] Institute of Genetic Medicine, Wellcome Trust Centre for Mitochondrial Research, Newcastle University, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK [2]
- Talya Dor
  - The Monique and Jacques Roboh Department of Genetic Research, Hadassah-Hebrew University Medical Center, Jerusalem 91120, Israel
- Jade Quartararo
  - Department of Life Sciences, University of Parma, Parco Area delle Scienze 11A, Parma 43124, Italy
- Ileana Ferrero
  - Department of Life Sciences, University of Parma, Parco Area delle Scienze 11A, Parma 43124, Italy
- Veronika Karcagi
  - Department of Molecular Genetics and Diagnostics, NIEH, Albert Florian ut 2-6, Budapest 1097, Hungary
- Michele Giunta
  - Institute of Genetic Medicine, Wellcome Trust Centre for Mitochondrial Research, Newcastle University, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK
- Tuomo Polvikoski
  - Department of Pathology, Institute for Ageing and Health, Newcastle University, Campus for Ageing and Vitality, Newcastle upon Tyne NE4 5PL, UK
- Daniel Birchall
  - Neuroradiology Department, Regional Neurosciences Centre, Queen Victoria Road, Newcastle upon Tyne NE1 4PL, UK
- Agota Princzinger
  - Department of Paediatrics, Josa Andras Hospital, Szent Istvan utca 6, Nyiregyhaza 4400, Hungary
- Yuval Cinnamon
  - [1] The Monique and Jacques Roboh Department of Genetic Research, Hadassah-Hebrew University Medical Center, Jerusalem 91120, Israel [2] Department of Poultry and Aquaculture Sciences, Institute of Animal Science, Agricultural Research Organization, The Volcani Center, P.O.Box 6, Bet Dagan 50250, Israel
- Susanne Lützkendorf
  - Department of Neuropediatrics and NeuroCure Clinical Research Center, Charité-Universitätsmedizin, Charité-Platz 1, 10117 Berlin, Germany
- Henriett Piko
  - Department of Molecular Genetics and Diagnostics, NIEH, Albert Florian ut 2-6, Budapest 1097, Hungary
- Mojgan Reza
  - Institute of Genetic Medicine, Wellcome Trust Centre for Mitochondrial Research, Newcastle University, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK
- Laura Florez
  - Western Australian Institute for Medical Research/Centre for Medical Research, The University of Western Australia, 35 Stirling Highway Crawley, Western Australia 6009 Perth, Australia
- Mauro Santibanez-Koref
  - Institute of Genetic Medicine, Wellcome Trust Centre for Mitochondrial Research, Newcastle University, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK
- Helen Griffin
  - Institute of Genetic Medicine, Wellcome Trust Centre for Mitochondrial Research, Newcastle University, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK
- Markus Schuelke
  - Department of Neuropediatrics and NeuroCure Clinical Research Center, Charité-Universitätsmedizin, Charité-Platz 1, 10117 Berlin, Germany
- Orly Elpeleg
  - The Monique and Jacques Roboh Department of Genetic Research, Hadassah-Hebrew University Medical Center, Jerusalem 91120, Israel
- Luba Kalaydjieva
  - Western Australian Institute for Medical Research/Centre for Medical Research, The University of Western Australia, 35 Stirling Highway Crawley, Western Australia 6009 Perth, Australia
- Hanns Lochmüller
  - Institute of Genetic Medicine, Wellcome Trust Centre for Mitochondrial Research, Newcastle University, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK
- David J Elliott
  - Institute of Genetic Medicine, Wellcome Trust Centre for Mitochondrial Research, Newcastle University, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK
- Patrick F Chinnery
  - Institute of Genetic Medicine, Wellcome Trust Centre for Mitochondrial Research, Newcastle University, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK
- Shimon Edvardson
  - The Monique and Jacques Roboh Department of Genetic Research, Hadassah-Hebrew University Medical Center, Jerusalem 91120, Israel
- Rita Horvath
  - Institute of Genetic Medicine, Wellcome Trust Centre for Mitochondrial Research, Newcastle University, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK
756
Genomic diversity of Epstein-Barr virus genomes isolated from primary nasopharyngeal carcinoma biopsy samples. J Virol 2014; 88:10662-72. [PMID: 24991008 DOI: 10.1128/jvi.01665-14] [Citation(s) in RCA: 81] [Impact Index Per Article: 7.4] [Indexed: 11/20/2022]
Abstract
Undifferentiated nasopharyngeal carcinoma (NPC) has a 100% association with Epstein-Barr virus (EBV). However, only three EBV genomes isolated from NPC patients have been sequenced to date, and the role of EBV genomic variations in the pathogenesis of NPC is unclear. We sought to obtain the sequences of EBV genomes in multiple NPC biopsy specimens in the same geographic location in order to reveal their sequence diversity. Three published EBV (B95-8, C666-1, and HKNPC1) genomes were first resequenced using the sequencing workflow of target enrichment of EBV DNA by hybridization, followed by next-generation sequencing, de novo assembly, and joining of contigs by Sanger sequencing. The sequences of eight NPC biopsy specimen-derived EBV (NPC-EBV) genomes, designated HKNPC2 to HKNPC9, were then determined. They harbored 1,736 variations in total, including 1,601 substitutions, 64 insertions, and 71 deletions, compared to the reference EBV. Furthermore, genes encoding latent, early lytic, and tegument proteins and glycoproteins were found to contain nonsynonymous mutations of potential biological significance. Phylogenetic analysis showed that the HKNPC6 and -7 genomes, which were isolated from tumor biopsy specimens of advanced metastatic NPC cases, were distinct from the other six NPC-EBV genomes, suggesting the presence of at least two parental lineages of EBV among the NPC-EBV genomes. In conclusion, much greater sequence diversity among EBV isolates derived from NPC biopsy specimens is demonstrated on a whole-genome level through a complete sequencing workflow. Large-scale sequencing and comparison of EBV genomes isolated from NPC and normal subjects should be performed to assess whether EBV genomic variations contribute to NPC pathogenesis.
IMPORTANCE: This study established a sequencing workflow from EBV DNA capture and sequencing to de novo assembly and contig joining. We reported eight newly sequenced EBV genomes isolated from primary NPC biopsy specimens and revealed the sequence diversity on a whole-genome level among these EBV isolates. At least two lineages of EBV strains are observed, and recombination among these lineages is inferred. Our study has demonstrated the value of, and provided a platform for, genome sequencing of EBV.
757
Taylor RW, Pyle A, Griffin H, Blakely EL, Duff J, He L, Smertenko T, Alston CL, Neeve VC, Best A, Yarham JW, Kirschner J, Schara U, Talim B, Topaloglu H, Baric I, Holinski-Feder E, Abicht A, Czermin B, Kleinle S, Morris AA, Vassallo G, Gorman GS, Ramesh V, Turnbull DM, Santibanez-Koref M, McFarland R, Horvath R, Chinnery PF. Use of whole-exome sequencing to determine the genetic basis of multiple mitochondrial respiratory chain complex deficiencies. JAMA 2014; 312:68-77. [PMID: 25058219 PMCID: PMC6558267 DOI: 10.1001/jama.2014.7184] [Citation(s) in RCA: 279] [Impact Index Per Article: 25.4] [Indexed: 12/11/2022]
Abstract
IMPORTANCE: Mitochondrial disorders have emerged as a common cause of inherited disease, but their diagnosis remains challenging. Multiple respiratory chain complex defects are particularly difficult to diagnose at the molecular level because of the massive number of nuclear genes potentially involved in intramitochondrial protein synthesis, with many not yet linked to human disease.
OBJECTIVE: To determine the molecular basis of multiple respiratory chain complex deficiencies.
DESIGN, SETTING, AND PARTICIPANTS: We studied 53 patients referred to 2 national centers in the United Kingdom and Germany between 2005 and 2012. All had biochemical evidence of multiple respiratory chain complex defects but no primary pathogenic mitochondrial DNA mutation. Whole-exome sequencing was performed using 62-Mb exome enrichment, followed by variant prioritization using bioinformatic prediction tools, variant validation by Sanger sequencing, and segregation of the variant with the disease phenotype in the family.
RESULTS: Presumptive causal variants were identified in 28 patients (53%; 95% CI, 39%-67%) and possible causal variants were identified in 4 (8%; 95% CI, 2%-18%). Together these accounted for 32 patients (60%; 95% CI, 46%-74%) and involved 18 different genes. These included recurrent mutations in RMND1, AARS2, and MTO1, each on a haplotype background consistent with a shared founder allele, and potential novel mutations in 4 possible mitochondrial disease genes (VARS2, GARS, FLAD1, and PTCD1). Distinguishing clinical features included deafness and renal involvement associated with RMND1 and cardiomyopathy with AARS2 and MTO1. However, atypical clinical features were present in some patients, including normal liver function and Leigh syndrome (subacute necrotizing encephalomyelopathy) seen in association with TRMU mutations and no cardiomyopathy with founder SCO2 mutations. It was not possible to confidently identify the underlying genetic basis in 21 patients (40%; 95% CI, 26%-54%).
CONCLUSIONS AND RELEVANCE: Exome sequencing enhances the ability to identify potential nuclear gene mutations in patients with biochemically defined defects affecting multiple mitochondrial respiratory chain complexes. Additional study is required in independent patient populations to determine the utility of this approach in comparison with traditional diagnostic methods.
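The proportions and 95% confidence intervals quoted in the results (for example 28 of 53 patients) can be checked with a short calculation. The abstract does not state which interval method was used; the sketch below assumes the exact (Clopper-Pearson) binomial interval, found by bisection on the binomial tail probabilities. The function names are illustrative only.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root(g, iters=60):
    """Bisection on [0, 1] for a function g that increases through zero."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a proportion k/n (assumed method)."""
    # Lower bound: the p at which P(X >= k) equals alpha/2.
    lower = 0.0 if k == 0 else _root(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha/2.
    upper = 1.0 if k == n else _root(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper


if __name__ == "__main__":
    # Counts taken from the abstract: 28, 4, 32 and 21 out of 53 patients.
    for k in (28, 4, 32, 21):
        lo, hi = clopper_pearson(k, 53)
        print(f"{k}/53 = {k / 53:.0%}, 95% CI {lo:.0%} to {hi:.0%}")
```

If the authors used a different method (for instance a normal approximation), the bounds would differ slightly, so small discrepancies from the printed figures should not be read as errors in the paper.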
Affiliation(s)
- Robert W. Taylor
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Ageing and Health, The Medical School, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK
- Angela Pyle
  - Wellcome Trust Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle University, Central Parkway, Newcastle upon Tyne, NE1 3BZ, UK
- Helen Griffin
  - Wellcome Trust Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle University, Central Parkway, Newcastle upon Tyne, NE1 3BZ, UK
- Emma L. Blakely
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Ageing and Health, The Medical School, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK
- Jennifer Duff
  - Wellcome Trust Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle University, Central Parkway, Newcastle upon Tyne, NE1 3BZ, UK
- Langping He
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Ageing and Health, The Medical School, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK
- Tania Smertenko
  - Wellcome Trust Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle University, Central Parkway, Newcastle upon Tyne, NE1 3BZ, UK
- Charlotte L. Alston
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Ageing and Health, The Medical School, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK
- Vivienne C. Neeve
  - Wellcome Trust Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle University, Central Parkway, Newcastle upon Tyne, NE1 3BZ, UK
- Andrew Best
  - Wellcome Trust Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle University, Central Parkway, Newcastle upon Tyne, NE1 3BZ, UK
- John W. Yarham
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Ageing and Health, The Medical School, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK
- Janbernd Kirschner
  - Division of Neuropediatrics and Muscle Disorders, University Medical Center Freiburg, Germany
- Ulrike Schara
  - Department of Neuropediatrics, University of Essen, Essen, Germany
- Beril Talim
  - Department of Pediatrics, Hacettepe University, Ankara, Turkey
- Haluk Topaloglu
  - Department of Pediatrics, Hacettepe University, Ankara, Turkey
- Ivo Baric
  - Department of Paediatrics, University Hospital Center Zagreb & University of Zagreb, School of Medicine, Zagreb, Croatia
- Andrew A.M. Morris
  - Willink Biochemical Genetics Unit, Manchester Centre for Genomic Medicine, Central Manchester University Hospitals NHS Foundation Trust, Manchester, M13 9WL
- Grace Vassallo
  - Department of Paediatric Neurology, Central Manchester University Hospitals NHS Foundation Trust, Manchester, M13 9WL
- Grainne S. Gorman
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Ageing and Health, The Medical School, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK
- Venkateswaran Ramesh
  - Department of Paediatric Neurology, Newcastle upon Tyne Hospitals NHS Trust, Newcastle upon Tyne, NE1 4LP
- Douglass M. Turnbull
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Ageing and Health, The Medical School, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK
- Mauro Santibanez-Koref
  - Wellcome Trust Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle University, Central Parkway, Newcastle upon Tyne, NE1 3BZ, UK
- Robert McFarland
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Ageing and Health, The Medical School, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK
  - Department of Paediatric Neurology, Newcastle upon Tyne Hospitals NHS Trust, Newcastle upon Tyne, NE1 4LP
- Rita Horvath
  - Wellcome Trust Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle University, Central Parkway, Newcastle upon Tyne, NE1 3BZ, UK
- Patrick F. Chinnery
  - Wellcome Trust Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle University, Central Parkway, Newcastle upon Tyne, NE1 3BZ, UK
758
Jeck WR, Parker J, Carson CC, Shields JM, Sambade MJ, Peters EC, Burd CE, Thomas NE, Chiang DY, Liu W, Eberhard DA, Ollila D, Grilley-Olson J, Moschos S, Neil Hayes D, Sharpless NE. Targeted next generation sequencing identifies clinically actionable mutations in patients with melanoma. Pigment Cell Melanoma Res 2014; 27:653-63. [PMID: 24628946 PMCID: PMC4121659 DOI: 10.1111/pcmr.12238] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 05/20/2013] [Revised: 01/08/2014] [Indexed: 12/30/2022]
Abstract
Somatic sequencing of cancers has produced new insight into tumorigenesis, tumor heterogeneity, and disease progression, but the vast majority of genetic events identified are of indeterminate clinical significance. Here, we describe a NextGen sequencing approach to fully analyzing 248 genes, including all those of known clinical significance in melanoma. This strategy features solution capture of DNA followed by multiplexed, high-throughput sequencing and was evaluated in 31 melanoma cell lines and 18 tumor tissues from patients with metastatic melanoma. Mutations in melanoma cell lines correlated with their sensitivity to corresponding small molecule inhibitors, confirming, for example, lapatinib sensitivity in ERBB4 mutant lines and identifying a novel activating mutation of BRAF. The latter event would not have been identified by clinical sequencing and was associated with responsiveness to a BRAF kinase inhibitor. This approach identified focal copy number changes of PTEN not found by standard methods, such as comparative genomic hybridization (CGH). Actionable mutations were found in 89% of the tumor tissues analyzed, 56% of which would not be identified by standard-of-care approaches. This work shows that targeted sequencing is an attractive approach for clinical use in melanoma.
Affiliation(s)
- William R Jeck
  - Department of Genetics, University of North Carolina School of Medicine, Chapel Hill, NC, USA
759
Dheilly NM, Adema C, Raftos DA, Gourbal B, Grunau C, Du Pasquier L. No more non-model species: the promise of next generation sequencing for comparative immunology. Dev Comp Immunol 2014; 45:56-66. [PMID: 24508980 PMCID: PMC4096995 DOI: 10.1016/j.dci.2014.01.022] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.8] [Received: 12/04/2013] [Revised: 01/20/2014] [Accepted: 01/21/2014] [Indexed: 05/21/2023]
Abstract
Next generation sequencing (NGS) allows for the rapid, comprehensive and cost effective analysis of entire genomes and transcriptomes. NGS provides approaches for immune response gene discovery, profiling gene expression over the course of parasitosis, studying mechanisms of diversification of immune receptors and investigating the role of epigenetic mechanisms in regulating immune gene expression and/or diversification. NGS will allow meaningful comparisons to be made between organisms from different taxa in an effort to understand the selection of diverse strategies for host defence under different environmental pathogen pressures. At the same time, it will reveal the shared and unique components of the immunological toolkit and basic functional aspects that are essential for immune defence throughout the living world. In this review, we argue that NGS will revolutionize our understanding of immune responses throughout the animal kingdom because the depth of information it provides will circumvent the need to concentrate on a few "model" species.
Affiliation(s)
- Nolwenn M Dheilly
  - CNRS, UMR 5244, Ecologie et Evolution des Interactions (2EI), Perpignan F-66860, France; Université de Perpignan Via Domitia, Perpignan F-66860, France
- Coen Adema
  - Center for Evolutionary and Theoretical Immunology, Biology Department, University of New Mexico, Albuquerque, NM 87131, USA
- David A Raftos
  - Department of Biological Sciences, Macquarie University, North Ryde, NSW 2109, Australia
- Benjamin Gourbal
  - CNRS, UMR 5244, Ecologie et Evolution des Interactions (2EI), Perpignan F-66860, France; Université de Perpignan Via Domitia, Perpignan F-66860, France
- Christoph Grunau
  - CNRS, UMR 5244, Ecologie et Evolution des Interactions (2EI), Perpignan F-66860, France; Université de Perpignan Via Domitia, Perpignan F-66860, France
- Louis Du Pasquier
  - University of Basel, Institute of Zoology and Evolutionary Biology, Basel, Switzerland
Collapse
|
760
|
Shao W, Kearney MF, Boltz VF, Spindler JE, Mellors JW, Maldarelli F, Coffin JM. PAPNC, a novel method to calculate nucleotide diversity from large scale next generation sequencing data. J Virol Methods 2014; 203:73-80. [PMID: 24681054 PMCID: PMC4104926 DOI: 10.1016/j.jviromet.2014.03.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 01/10/2014] [Revised: 03/10/2014] [Accepted: 03/11/2014] [Indexed: 02/06/2023]
Abstract
Estimating viral diversity in infected patients can provide insight into pathogen evolution and emergence of drug resistance. With the widespread adoption of deep sequencing, it is important to develop tools to accurately calculate population diversity from very large datasets. Current methods for estimating diversity that are based on multiple alignments are not practical to apply to such data. In this study, the authors report a novel method (Pairwise Alignment Positional Nucleotide Counting, PAPNC) for estimating population diversity from 454 sequence data. The diversity measurements determined using this method were comparable to those calculated by average pairwise difference (APD) of multiply aligned sequences using MEGA5. Diversities were estimated for 9 patient plasma HIV samples sequenced with Titanium 454 technology and by single-genome sequencing (SGS). Diversities calculated from deep sequencing using PAPNC ranged from 0.002 to 0.021, while APD measurements calculated from SGS data ranged approximately from 0.001 to 0.018, with the difference being attributable to PCR error (contributing background diversity of 0.0016 in a control sample). Comparison of APDs estimated from 100 sets of sequences drawn at random from 454-generated data and from corresponding SGS data showed very close correlation between the two methods, with R² of 0.96, and differing on average by about 1% (after correction for PCR error). The authors have developed a novel method that is well suited to calculating genetic diversities for large scale datasets from next generation sequencing. It can be implemented easily as a function in available variant calling programs like SAMtools or haplotype reconstruction software for nucleotide genetic diversity calculation. A Perl script implementing this method is available upon request.
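The idea of estimating diversity from per-position nucleotide counts, rather than from every pair of sequences, can be illustrated with a minimal Python sketch. It is not the authors' Perl script: the function names, the use of `-` for uncovered positions, and the assumption that reads are already aligned to a common reference are all hypothetical choices made for illustration. At each site, the fraction of read pairs that differ is one minus the fraction of pairs that share a base; averaging over sites gives a per-site diversity that matches the average pairwise difference when every read covers every site.

```python
from collections import Counter
from itertools import combinations


def positional_diversity(reads):
    """Mean per-site pairwise difference from per-position nucleotide counts.

    `reads` are equal-length strings already aligned to a common reference
    (an assumption of this sketch); '-' marks a position a read does not cover.
    """
    length = len(reads[0])
    total = 0.0
    sites = 0
    for pos in range(length):
        counts = Counter(r[pos] for r in reads if r[pos] != "-")
        n = sum(counts.values())
        if n < 2:
            continue  # no pair of reads can be compared at this site
        # Ordered pairs of reads that carry the same base at this site.
        same = sum(c * (c - 1) for c in counts.values())
        total += 1.0 - same / (n * (n - 1))
        sites += 1
    return total / sites if sites else 0.0


def average_pairwise_difference(reads):
    """Reference calculation: mean p-distance over all read pairs (no gaps)."""
    length = len(reads[0])
    pairs = list(combinations(reads, 2))
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / (len(pairs) * length)


reads = ["ACGT", "ACGA", "ACGT", "TCGT"]
print(positional_diversity(reads))         # 0.25
print(average_pairwise_difference(reads))  # 0.25
```

The counting version visits each read once per site, so its cost grows linearly with the number of reads, whereas the explicit pairwise calculation grows quadratically; that difference is what makes a counting approach attractive for deep sequencing data.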
Affiliation(s)
- Wei Shao
  - Advanced Biomedical Computing Center, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, MD, United States
- Mary F Kearney
  - HIV Drug Resistance Program, NCI, Frederick, MD, United States
- Valerie F Boltz
  - HIV Drug Resistance Program, NCI, Frederick, MD, United States
- John W Mellors
  - Division of Infectious Diseases, University of Pittsburgh, Pittsburgh, PA, United States
- John M Coffin
  - Department of Molecular Biology and Microbiology, Tufts University, Boston, MA, United States
761
|
Petrini I, Meltzer PS, Kim IK, Lucchi M, Park KS, Fontanini G, Gao J, Zucali PA, Calabrese F, Favaretto A, Rea F, Rodriguez-Canales J, Walker RL, Pineda M, Zhu YJ, Lau C, Killian KJ, Bilke S, Voeller D, Dakshanamurthy S, Wang Y, Giaccone G. A specific missense mutation in GTF2I occurs at high frequency in thymic epithelial tumors. Nat Genet 2014; 46:844-9. [PMID: 24974848 DOI: 10.1038/ng.3016] [Citation(s) in RCA: 188] [Impact Index Per Article: 17.1] [Received: 01/07/2014] [Accepted: 06/02/2014] [Indexed: 12/15/2022]
Abstract
We analyzed 28 thymic epithelial tumors (TETs) using next-generation sequencing and identified a missense mutation (chromosome 7 c.74146970T>A) in GTF2I at high frequency in type A thymomas, a relatively indolent subtype. In a series of 274 TETs, we detected the GTF2I mutation in 82% of type A and 74% of type AB thymomas but rarely in the aggressive subtypes, where recurrent mutations of known cancer genes have been identified. Therefore, GTF2I mutation correlated with better survival. GTF2I β and δ isoforms were expressed in TETs, and both mutant isoforms were able to stimulate cell proliferation in vitro. Thymic carcinomas carried a higher number of mutations than thymomas (average of 43.5 and 18.4, respectively). Notably, we identified recurrent mutations of known cancer genes, including TP53, CYLD, CDKN2A, BAP1 and PBRM1, in thymic carcinomas. These findings will complement the diagnostic assessment of these tumors and also facilitate development of a molecular classification and assessment of prognosis and treatment strategies.
Affiliation(s)
- Iacopo Petrini
  - Medical Oncology Branch, National Cancer Institute, Bethesda, Maryland, USA
- Paul S Meltzer
  - Genetics Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland, USA
- In-Kyu Kim
  - Lombardi Comprehensive Cancer Center, Georgetown University, Washington DC, USA
- Marco Lucchi
  - Thoracic Surgery, Pisa University Hospital, Pisa, Italy
- Kang-Seo Park
  - Lombardi Comprehensive Cancer Center, Georgetown University, Washington DC, USA
- James Gao
  - Medical Oncology Branch, National Cancer Institute, Bethesda, Maryland, USA
- Paolo A Zucali
  - Medical Oncology, Humanitas Clinical and Research Center, Rozzano, Milan, Italy
- Federico Rea
  - Thoracic Surgery, Padua University Hospital, Padua, Italy
- Robert L Walker
  - Genetics Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland, USA
- Marbin Pineda
  - Genetics Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland, USA
- Yuelin J Zhu
  - Genetics Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland, USA
- Christopher Lau
  - Genetics Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland, USA
- Keith J Killian
  - Genetics Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland, USA
- Sven Bilke
  - Genetics Branch, Center for Cancer Research, National Cancer Institute, Bethesda, Maryland, USA
- Donna Voeller
  - Medical Oncology Branch, National Cancer Institute, Bethesda, Maryland, USA
- Yisong Wang
  - [1] Medical Oncology Branch, National Cancer Institute, Bethesda, Maryland, USA. [2] Lombardi Comprehensive Cancer Center, Georgetown University, Washington DC, USA
- Giuseppe Giaccone
  - [1] Medical Oncology Branch, National Cancer Institute, Bethesda, Maryland, USA. [2] Lombardi Comprehensive Cancer Center, Georgetown University, Washington DC, USA
762
|
Single nucleotide polymorphisms associated with colorectal cancer susceptibility and loss of heterozygosity in a Taiwanese population. PLoS One 2014; 9:e100060. [PMID: 24968322 PMCID: PMC4072675 DOI: 10.1371/journal.pone.0100060] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 12/30/2013] [Accepted: 05/22/2014] [Indexed: 01/01/2023] Open
Abstract
Given the significant racial and ethnic diversity in genetic variation, we sought to determine whether the single nucleotide polymorphisms (SNPs) identified in genome-wide association studies of colorectal cancer (CRC) susceptibility in East Asian populations are also relevant to the population of Taiwan. Moreover, loss of heterozygosity (LOH) may provide insight into how variants alter CRC risk and how regulatory elements control gene expression. To investigate the racial and ethnic diversity of CRC-susceptibility genetic variants and their relevance to the Taiwanese population, we genotyped 705 CRC cases and 1,802 healthy controls (Taiwan Biobank) for fifteen previously reported East Asian CRC-susceptibility SNPs and four novel genetic variants identified by whole-exome sequencing. We found that rs10795668 in FLJ3802842 and rs4631962 in CCND2 were significantly associated with CRC risk in the Taiwanese population. The previously unreported rs1338565 was associated with a significantly increased risk of CRC. In addition, we genotyped tumor tissue and paired adjacent normal tissues of these 705 CRC cases to search for LOH, as well as risk-associated and protective alleles. LOH analysis revealed preferential retention of three SNPs, rs12657484, rs3802842, and rs4444235, in tumor tissues. rs4444235 has recently been reported to be a cis-acting regulator of the BMP4 gene; in this study, the C allele was preferentially retained in tumor tissues (p = 0.0023). rs4631962 and rs10795668 contribute to CRC risk in the Taiwanese and East Asian populations, and the newly identified rs1338565 was specifically associated with CRC, supporting the ethnic diversity of CRC-susceptibility SNPs. LOH analysis suggested that the three CRC risk variants, rs12657484, rs3802842, and rs4444235, exhibited somatic allele-specific imbalance and might be critical during neoplastic progression.
763
|
Wilkerson MD, Cabanski CR, Sun W, Hoadley KA, Walter V, Mose LE, Troester MA, Hammerman PS, Parker JS, Perou CM, Hayes DN. Integrated RNA and DNA sequencing improves mutation detection in low purity tumors. Nucleic Acids Res 2014; 42:e107. [PMID: 24970867 PMCID: PMC4117748 DOI: 10.1093/nar/gku489] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.9] [Indexed: 12/11/2022] Open
Abstract
Identifying somatic mutations is critical for cancer genome characterization and for prioritizing patient treatment. DNA whole exome sequencing (DNA-WES) is currently the most popular technology; however, this yields low sensitivity in low purity tumors. RNA sequencing (RNA-seq) covers the expressed exome with depth proportional to expression. We hypothesized that integrating DNA-WES and RNA-seq would enable superior mutation detection versus DNA-WES alone. We developed a first-of-its-kind method, called UNCeqR, that detects somatic mutations by integrating patient-matched RNA-seq and DNA-WES. In simulation, the integrated DNA and RNA model outperformed the DNA-WES only model. Validation by patient-matched whole genome sequencing demonstrated superior performance of the integrated model over DNA-WES only models, including a published method and published mutation profiles. Genome-wide mutational analysis of breast and lung cancer cohorts (n = 871) revealed remarkable tumor genomics properties. Low purity tumors experienced the largest gains in mutation detection by integrating RNA-seq and DNA-WES. RNA provided greater mutation signal than DNA in expressed mutations. Compared to earlier studies on this cohort, UNCeqR increased mutation rates of driver and therapeutically targeted genes (e.g. PIK3CA, ERBB2 and FGFR2). In summary, integrating RNA-seq with DNA-WES increases mutation detection performance, especially for low purity tumors.
Affiliation(s)
- Matthew D Wilkerson
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Christopher R Cabanski
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; The Genome Institute at Washington University, St. Louis, MO 63108, USA
- Wei Sun
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Katherine A Hoadley
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Vonn Walter
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Lisle E Mose
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Melissa A Troester
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Peter S Hammerman
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Joel S Parker
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Charles M Perou
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- D Neil Hayes
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Department of Internal Medicine, Division of Medical Oncology, Multidisciplinary Thoracic Oncology Program, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
764
|
Thavamanikumar S, Southerton S, Thumma B. RNA-Seq using two populations reveals genes and alleles controlling wood traits and growth in Eucalyptus nitens. PLoS One 2014; 9:e101104. [PMID: 24967893 PMCID: PMC4072731 DOI: 10.1371/journal.pone.0101104] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 03/23/2014] [Accepted: 06/02/2014] [Indexed: 11/17/2022] Open
Abstract
Eucalyptus nitens is a perennial forest tree species grown mainly for kraft pulp production in many parts of the world. Kraft pulp yield (KPY) is a key determinant of plantation profitability and increasing the KPY of trees grown in plantations is a major breeding objective. To speed up the breeding process, molecular markers that can predict KPY are desirable. To achieve this goal, we carried out RNA-Seq studies on trees at extremes of KPY in two different trials to identify genes and alleles whose expression correlated with KPY. KPY is positively correlated with growth measured as diameter at breast height (DBH) in both trials. In total, six RNA bulks from two treatments were sequenced on an Illumina HiSeq platform. At 5% false discovery rate level, 3953 transcripts showed differential expression in the same direction in both trials; 2551 (65%) were down-regulated and 1402 (35%) were up-regulated in low KPY samples. The genes up-regulated in low KPY trees were largely involved in biotic and abiotic stress response reflecting the low growth among low KPY trees. Genes down-regulated in low KPY trees mainly belonged to gene categories involved in wood formation and growth. Differential allelic expression was observed in 2103 SNPs (in 1068 genes) and of these 640 SNPs (30%) occurred in 313 unique genes that were also differentially expressed. These SNPs may represent the cis-acting regulatory variants that influence total gene expression. In addition we also identified 196 genes which had Ka/Ks ratios greater than 1.5, suggesting that these genes are under positive selection. Candidate genes and alleles identified in this study will provide a valuable resource for future association studies aimed at identifying molecular markers for KPY and growth.
Affiliation(s)
- Saravanan Thavamanikumar
  - Department of Forest and Ecosystem Science, University of Melbourne, Creswick, Victoria, Australia
- Bala Thumma
  - CSIRO Plant Industry, Acton, ACT, Australia
765
|
Hwang KB, Lee IH, Park JH, Hambuch T, Choe Y, Kim M, Lee K, Song T, Neu MB, Gupta N, Kohane IS, Green RC, Kong SW. Reducing false-positive incidental findings with ensemble genotyping and logistic regression based variant filtering methods. Hum Mutat 2014; 35:936-44. [PMID: 24829188 DOI: 10.1002/humu.22587] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 10/15/2013] [Accepted: 04/29/2014] [Indexed: 12/29/2022]
Abstract
As whole genome sequencing (WGS) uncovers variants associated with rare and common diseases, an immediate challenge is to minimize false-positive findings due to sequencing and variant calling errors. False positives can be reduced by combining results from orthogonal sequencing methods, but this is costly. Here, we present variant filtering approaches using logistic regression (LR) and ensemble genotyping to minimize false positives without sacrificing sensitivity. We evaluated the methods using paired WGS datasets of an extended family prepared using two sequencing platforms and a validated set of variants in NA12878. Using LR or ensemble genotyping based filtering, false-negative rates were significantly reduced by 1.1- to 17.8-fold at the same levels of false discovery rates (5.4% for heterozygous and 4.5% for homozygous single nucleotide variants (SNVs); 30.0% for heterozygous and 18.7% for homozygous insertions; 25.2% for heterozygous and 16.6% for homozygous deletions) compared to filtering based on genotype quality scores. Moreover, ensemble genotyping excluded > 98% (105,080 of 107,167) of false positives while retaining > 95% (897 of 937) of true positives in de novo mutation (DNM) discovery in NA12878, and performed better than a consensus method using two sequencing platforms. Our proposed methods were effective in prioritizing phenotype-associated variants, and ensemble genotyping would be essential to minimize false-positive DNM candidates.
Affiliation(s)
- Kyu-Baek Hwang
  - Children's Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology, Boston Children's Hospital, Boston, Massachusetts; School of Computer Science and Engineering, Soongsil University, Seoul, 156-743, South Korea
766
|
Novel method for analysis of allele specific expression in triploid Oryzias latipes reveals consistent pattern of allele exclusion. PLoS One 2014; 9:e100250. [PMID: 24945156 PMCID: PMC4063754 DOI: 10.1371/journal.pone.0100250] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 02/21/2014] [Accepted: 05/22/2014] [Indexed: 11/30/2022] Open
Abstract
Assessing allele-specific gene expression (ASE) on a large scale continues to be a technically challenging problem. Certain biological phenomena, such as X chromosome inactivation and parental imprinting, affect ASE most drastically by completely shutting down the expression of a whole set of alleles. Other more subtle effects on ASE are likely to be much more complex and dependent on the genetic environment and are perhaps more important to understand since they may be responsible for a significant amount of biological diversity. Tools to assess ASE in a diploid biological system are becoming more reliable. Non-diploid systems are, however, not uncommon. In humans full or partial polyploid states are regularly found in both healthy (meiotic cells, polynucleated cell types) and diseased tissues (trisomies, non-disjunction events, cancerous tissues). In this work we have studied ASE in the medaka fish model system. We have developed a method for determining ASE in polyploid organisms from RNAseq data and we have implemented this method in a software tool set. As a biological model system we have used nuclear transplantation to experimentally produce artificial triploid medaka composed of three different haplomes. We measured ASE in RNA isolated from the livers of two adult, triploid medaka fish that showed a high degree of similarity. The majority of genes examined (82%) shared expression more or less evenly among the three alleles in both triploids. The rest of the genes (18%) displayed a wide range of ASE levels. Interestingly the majority of genes (78%) displayed generally consistent ASE levels in both triploid individuals. A large contingent of these genes had the same allele entirely suppressed in both triploids. When viewed in a chromosomal context, it is revealed that these genes are from large sections of 4 chromosomes and may be indicative of some broad scale suppression of gene expression.
767
|
Yarham JW, Lamichhane TN, Pyle A, Mattijssen S, Baruffini E, Bruni F, Donnini C, Vassilev A, He L, Blakely EL, Griffin H, Santibanez-Koref M, Bindoff LA, Ferrero I, Chinnery PF, McFarland R, Maraia RJ, Taylor RW. Defective i6A37 modification of mitochondrial and cytosolic tRNAs results from pathogenic mutations in TRIT1 and its substrate tRNA. PLoS Genet 2014; 10:e1004424. [PMID: 24901367 PMCID: PMC4046958 DOI: 10.1371/journal.pgen.1004424] [Citation(s) in RCA: 103] [Impact Index Per Article: 9.4] [Received: 10/29/2013] [Accepted: 04/20/2014] [Indexed: 01/10/2023] Open
Abstract
Identifying the genetic basis for mitochondrial diseases is technically challenging given the size of the mitochondrial proteome and the heterogeneity of disease presentations. Using next-generation exome sequencing, we identified in a patient with severe combined mitochondrial respiratory chain defects and corresponding perturbation in mitochondrial protein synthesis, a homozygous p.Arg323Gln mutation in TRIT1. This gene encodes human tRNA isopentenyltransferase, which is responsible for i6A37 modification of the anticodon loops of a small subset of cytosolic and mitochondrial tRNAs. Deficiency of i6A37 was previously shown in yeast to decrease translational efficiency and fidelity in a codon-specific manner. Modelling of the p.Arg323Gln mutation on the co-crystal structure of the homologous yeast isopentenyltransferase bound to a substrate tRNA, indicates that it is one of a series of adjacent basic side chains that interact with the tRNA backbone of the anticodon stem, somewhat removed from the catalytic center. We show that patient cells bearing the p.Arg323Gln TRIT1 mutation are severely deficient in i6A37 in both cytosolic and mitochondrial tRNAs. Complete complementation of the i6A37 deficiency of both cytosolic and mitochondrial tRNAs was achieved by transduction of patient fibroblasts with wild-type TRIT1. Moreover, we show that a previously-reported pathogenic m.7480A>G mt-tRNASer(UCN) mutation in the anticodon loop sequence A36A37A38 recognised by TRIT1 causes a loss of i6A37 modification. These data demonstrate that deficiencies of i6A37 tRNA modification should be considered a potential mechanism of human disease caused by both nuclear gene and mitochondrial DNA mutations while providing insight into the structure and function of TRIT1 in the modification of cytosolic and mitochondrial tRNAs. 
Mitochondrial disorders are clinically diverse, and identifying the underlying genetic mutations is technically challenging due to the large number of mitochondrial proteins. Using high-throughput sequencing technology, we identified a disease-causing mutation in the TRIT1 gene. This gene encodes an enzyme, tRNA isopentenyltransferase, that adds an N6-isopentenyl modification to adenosine-37 (i6A37) in a small number of tRNAs, enabling them to function correctly during the synthesis of essential mitochondrial proteins. We show that this mutation leads to severe deficiency of tRNA-i6A37 in the patient's cells that can be rescued by introduction of the wild-type TRIT1 protein. A deficiency in oxidative phosphorylation, the process by which energy (ATP) is generated in the mitochondria, leads to a mitochondrial disease presentation. Introducing the mutant protein into model yeast species and measuring the resulting impairment provided further evidence of the pathogenic effect of the mutation. Additional studies investigating a previously reported pathogenic mutation in a mitochondrial tRNA gene demonstrated that a mutation in a substrate of TRIT1 can also cause a loss of the modification, providing evidence of a new mechanism causing mitochondrial disease in humans.
Affiliation(s)
- John W. Yarham
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Ageing and Health, The Medical School, Newcastle University, Newcastle upon Tyne, United Kingdom
- Tek N. Lamichhane
  - Intramural Research Program, NICHD, NIH, Bethesda, Maryland, United States of America
- Angela Pyle
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Genetic Medicine, The Medical School, Newcastle University, Newcastle upon Tyne, United Kingdom
- Sandy Mattijssen
  - Intramural Research Program, NICHD, NIH, Bethesda, Maryland, United States of America
- Francesco Bruni
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Ageing and Health, The Medical School, Newcastle University, Newcastle upon Tyne, United Kingdom
- Claudia Donnini
  - Department of Life Sciences, University of Parma, Parma, Italy
- Alex Vassilev
  - Intramural Research Program, NICHD, NIH, Bethesda, Maryland, United States of America
- Langping He
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Ageing and Health, The Medical School, Newcastle University, Newcastle upon Tyne, United Kingdom
- Emma L. Blakely
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Ageing and Health, The Medical School, Newcastle University, Newcastle upon Tyne, United Kingdom
- Helen Griffin
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Genetic Medicine, The Medical School, Newcastle University, Newcastle upon Tyne, United Kingdom
- Mauro Santibanez-Koref
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Genetic Medicine, The Medical School, Newcastle University, Newcastle upon Tyne, United Kingdom
- Laurence A. Bindoff
  - Department of Neurology, Haukeland University Hospital, Bergen, Norway
  - Department of Clinical Medicine, University of Bergen, Bergen, Norway
- Ileana Ferrero
  - Department of Life Sciences, University of Parma, Parma, Italy
- Patrick F. Chinnery
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Genetic Medicine, The Medical School, Newcastle University, Newcastle upon Tyne, United Kingdom
- Robert McFarland
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Ageing and Health, The Medical School, Newcastle University, Newcastle upon Tyne, United Kingdom
- Richard J. Maraia
  - Intramural Research Program, NICHD, NIH, Bethesda, Maryland, United States of America
- Robert W. Taylor
  - Wellcome Trust Centre for Mitochondrial Research, Institute for Ageing and Health, The Medical School, Newcastle University, Newcastle upon Tyne, United Kingdom
768
|
Gardner K, Payne BAI, Horvath R, Chinnery PF. Use of stereotypical mutational motifs to define resolution limits for the ultra-deep resequencing of mitochondrial DNA. Eur J Hum Genet 2014; 23:413-5. [PMID: 24896153 PMCID: PMC4326723 DOI: 10.1038/ejhg.2014.96] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 02/06/2014] [Revised: 04/07/2014] [Accepted: 04/24/2014] [Indexed: 11/22/2022] Open
Abstract
Massively parallel resequencing of mitochondrial DNA (mtDNA) has led to significant advances in the study of heteroplasmic mtDNA variants in health and disease, but confident resolution of very low-level variants (<2% heteroplasmy) remains challenging due to the difficulty in distinguishing signal from noise at this depth. However, it is likely that such variants are precisely those of greatest interest in the study of somatic (acquired) mtDNA mutations. Previous approaches to this issue have included the use of controls such as phage DNA and mtDNA clones, neither of which may accurately recapitulate natural mtDNA. We have therefore explored a novel approach, taking advantage of mtDNA with a known stereotyped mutational motif (nAT>C, from a patient with MNGIE, mitochondrial neurogastrointestinal encephalomyopathy) and comparing the mutational pattern distribution with healthy mtDNA by ligation-mediated deep resequencing (Applied Biosystems SOLiD). We empirically derived mtDNA-mutant heteroplasmy detection limits, demonstrating that the presence of the stereotypical mutational motif could be statistically validated for heteroplasmy thresholds ≥0.22% (P=0.034). We therefore provide empirical evidence from biological samples that very low-level mtDNA mutants can be meaningfully resolved by massively parallel resequencing, confirming the utility of the approach for studying somatic mtDNA mutation in health and disease. Our approach could also usefully be employed in other settings to derive platform-specific deep resequencing resolution limits.
Affiliation(s)
- Kristian Gardner
  - Mitochondrial Research Group, Institute of Genetic Medicine, Newcastle University, Newcastle-upon-Tyne, UK
- Brendan A I Payne
  - Mitochondrial Research Group, Institute of Genetic Medicine, Newcastle University, Newcastle-upon-Tyne, UK
- Rita Horvath
  - Mitochondrial Research Group, Institute of Genetic Medicine, Newcastle University, Newcastle-upon-Tyne, UK
- Patrick F Chinnery
  - Mitochondrial Research Group, Institute of Genetic Medicine, Newcastle University, Newcastle-upon-Tyne, UK
769
|
Teer JK. An improved understanding of cancer genomics through massively parallel sequencing. Transl Cancer Res 2014; 3:243-259. [PMID: 26146607 PMCID: PMC4486294 DOI: 10.3978/j.issn.2218-676x.2014.05.05] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/21/2022]
Abstract
DNA sequencing technology advances have enabled genetic investigation of more samples in a shorter time than has previously been possible. Furthermore, the ability to analyze and understand large sequencing datasets has improved due to concurrent advances in sequence data analysis methods and software tools. Constant improvements to both technology and analytic approaches in this fast moving field are evidenced by many recent publications of computational methods, as well as biological results linking genetic events to human disease. Cancer in particular has been the subject of intense investigation, owing to the genetic underpinnings of this complex collection of diseases. New massively-parallel sequencing (MPS) technologies have enabled the investigation of thousands of samples, divided across tens of different tumor types, resulting in new driver gene identification, mutagenic pattern characterization, and other newly uncovered features of tumor biology. This review will focus both on methods and recent results: current analytical approaches to DNA and RNA sequencing will be presented followed by a review of recent pan-cancer sequencing studies. This overview of methods and results will not only highlight the recent advances in cancer genomics, but also the methods and tools used to accomplish these advancements in a constantly and rapidly improving field.
Affiliation(s)
- Jamie K Teer
- H. Lee Moffitt Cancer Center and Research Institute, 12902 Magnolia Dr., Tampa, FL 33612, Tel: 813-745-2650
|
770
|
Kim K, Ban HJ, Seo J, Lee K, Yavartanoo M, Kim SC, Park K, Cho SB, Choi JK. Genetic factors underlying discordance in chromatin accessibility between monozygotic twins. Genome Biol 2014; 15:R72. [PMID: 24887574 PMCID: PMC4072931 DOI: 10.1186/gb-2014-15-5-r72] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 10/23/2013] [Accepted: 05/29/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Open chromatin is implicated in regulatory processes; thus, variations in chromatin structure may contribute to variations in gene expression and other phenotypes. In this work, we perform targeted deep sequencing for open chromatin, and array-based genotyping across the genomes of 72 monozygotic twins to identify genetic factors regulating co-twin discordance in chromatin accessibility. RESULTS We show that somatic mutations cause chromatin discordance mainly via the disruption of transcription factor binding sites. Structural changes in DNA due to C:G to A:T transversions are under purifying selection due to a strong impact on chromatin accessibility. We show that CpGs whose methylation is specifically regulated during cellular differentiation appear to be protected from high mutation rates of 5'-methylcytosines, suggesting that the spectrum of CpG variations may be shaped fully at the developmental level but not through natural selection. Based on the association mapping of within-pair chromatin differences, we search for cases in which twin siblings with a particular genotype had chromatin discordance at the relevant locus. We identify 1,325 chromatin sites that are differentially accessible, depending on the genotype of a nearby locus, suggesting that epigenetic differences can control regulatory variations via interactions with genetic factors. Poised promoters present high levels of chromatin discordance in association with either somatic mutations or genetic-epigenetic interactions. CONCLUSION Our observations illustrate how somatic mutations and genetic polymorphisms may contribute to regulatory, and ultimately phenotypic, discordance.
|
771
|
Whole-genome sequencing analysis identifies a distinctive mutational spectrum in an arsenic-related lung tumor. J Thorac Oncol 2014; 8:1451-5. [PMID: 24128716 DOI: 10.1097/jto.0b013e3182a4dd8e] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 11/26/2022]
Abstract
INTRODUCTION Arsenic exposure is a significant cause of lung cancer in North America and worldwide. Arsenic-related tumors are structurally indistinguishable from those induced by other carcinogens. Because carcinogens such as tobacco induce distinctive mutational signatures, we sought to characterize the mutational signature of an arsenic-related lung tumor from a never smoker using whole-genome sequencing. METHODS Tumor and lung tissues were obtained from a never smoker with lung squamous cell carcinoma (LUSC), without a family history of lung cancer, who had been chronically exposed to high levels of arsenic-contaminated drinking water. The Illumina HiSeq-2000 platform was used to sequence each genome at approximately 30-fold haploid coverage. The mutational signature was compared with those observed in previously characterized lung tumors. RESULTS The arsenic-related tumor exhibited alterations common in LUSC, such as an increased number of copies at 3q26 (SOX2 locus). However, the arsenic-related genome not only harbored a lower number of point mutations, but also had a remarkably high fraction of T>G/A>C mutations and a low fraction of C>A/G>T transversions, which is uncharacteristic of LUSCs. Furthermore, at the gene level, we identified a rare G>C mutation in TP53, which is uncommon in lung tumors in general (<0.2%) but has been observed in other arsenic-related malignancies. CONCLUSIONS We generated the first whole-genome sequence of an LUSC from a never-smoker patient chronically exposed to arsenic, and identified a distinct mutational spectrum associated with arsenic exposure, providing novel evidence supporting the hypothesis that arsenic-induced lung tumors arise through molecular mechanisms that differ from those of common lung cancers.
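The substitution classes named in this abstract (T>G/A>C, C>A/G>T) pair each change with its reverse complement. A minimal Python sketch of how such a strand-collapsed spectrum might be tallied is given here; the function names, the pyrimidine-reference convention, and the input format (a list of reference/alternate base pairs) are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter

# Complement map used to fold purine-reference changes onto the pyrimidine strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def collapse(ref: str, alt: str) -> str:
    """Return the strand-collapsed class, e.g. A>C is reported as T>G."""
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def spectrum(snvs):
    """Fraction of single-nucleotide variants in each collapsed class.

    `snvs` is a hypothetical iterable of (ref, alt) base pairs.
    """
    counts = Counter(collapse(ref, alt) for ref, alt in snvs)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


# Toy input: two T>G/A>C events, one C>A/G>T event, one C>T event.
example = [("A", "C"), ("T", "G"), ("G", "T"), ("C", "T")]
fractions = spectrum(example)
```

Comparing such fractions between tumors is one simple way to see whether a class such as T>G/A>C is over-represented.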
|
772
|
Quinn A, Juneja P, Jiggins FM. Estimates of allele-specific expression in Drosophila with a single genome sequence and RNA-seq data. Bioinformatics 2014; 30:2603-10. [PMID: 24845654 DOI: 10.1093/bioinformatics/btu342] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 01/12/2023]
Abstract
MOTIVATION Genetic variation in cis-regulatory elements is an important cause of variation in gene expression. Cis-regulatory variation can be detected by using high-throughput RNA sequencing (RNA-seq) to identify differences in the expression of the two alleles of a gene. This requires that reads from the two alleles are equally likely to map to a reference genome(s), and that single-nucleotide polymorphisms (SNPs) are accurately called, so that reads derived from the different alleles can be identified. Both of these prerequisites can be achieved by sequencing the genomes of the parents of the individual being studied, but this is often prohibitively costly. RESULTS In Drosophila, we demonstrate that biases during read mapping can be avoided by mapping reads to two alternative genomes that incorporate SNPs called from the RNA-seq data. The SNPs can be reliably called from the RNA-seq data itself, provided any variants not found in high-quality SNP databases are filtered out. Finally, we suggest a way of measuring allele-specific expression (ASE) by crossing the line of interest to a reference line with a high-quality genome sequence. Combined with our bioinformatic methods, this approach minimizes mapping biases, allows poor-quality data to be identified and removed, and aids in the biological interpretation of the data, as the parent of origin of each allele is known. In conclusion, our results suggest that accurate estimates of ASE do not require the parental genomes of the individual being studied to be sequenced. AVAILABILITY AND IMPLEMENTATION Scripts used to perform our analysis are available at https://github.com/d-quinn/bio_quinn2013.
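The allelic comparison described above can be illustrated with a minimal Python sketch. It assumes that read counts for each allele at a heterozygous SNP are already available; the function names and the exact two-sided binomial test against a 50:50 expectation are illustrative assumptions and are not taken from the authors' scripts.

```python
from math import comb


def allele_ratio(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads carrying the reference allele at one SNP."""
    total = ref_reads + alt_reads
    return ref_reads / total if total else float("nan")


def binom_two_sided_p(k: int, n: int) -> float:
    """Exact two-sided binomial p-value for k successes in n trials, p = 0.5.

    Sums the probabilities of all outcomes no more likely than the observed
    one. Intended for modest read counts only (0.5 ** n underflows for very
    large n).
    """
    if n == 0:
        return 1.0
    pmf = [comb(n, i) * 0.5 ** n for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


# Toy example: 30 reference reads and 10 alternate reads at one SNP.
ratio = allele_ratio(30, 10)
p_value = binom_two_sided_p(30, 40)
```

A ratio far from 0.5 with a small p-value would suggest allelic imbalance at that SNP, provided mapping bias has been controlled as the abstract describes.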
Affiliation(s)
- Andrew Quinn
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK
| | - Punita Juneja
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK
| | - Francis M Jiggins
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK
|
773
|
Moghaddam SM, Song Q, Mamidi S, Schmutz J, Lee R, Cregan P, Osorno JM, McClean PE. Developing market class specific InDel markers from next generation sequence data in Phaseolus vulgaris L. FRONTIERS IN PLANT SCIENCE 2014; 5:185. [PMID: 24860578 PMCID: PMC4026720 DOI: 10.3389/fpls.2014.00185] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 05/09/2013] [Accepted: 04/19/2014] [Indexed: 05/09/2023]
Abstract
Next generation sequence data provide valuable information and tools for genetic and genomic research and offer new insights useful for marker development. These data are useful for the design of accurate and user-friendly molecular tools. Common bean (Phaseolus vulgaris L.) is a diverse crop in which separate domestication events happened in each gene pool, followed by race and market class diversification that has resulted in different morphological characteristics in each commercial market class. This has led to essentially independent breeding programs within each market class, which in turn has resulted in limited within-market-class sequence variation. Sequence data from selected genotypes of five bean market classes (pinto, black, navy, and light and dark red kidney) were used to develop InDel-based markers specific to each market class. Design of the InDel markers was conducted through a combination of assembly, alignment and primer design software using 1.6× to 5.1× coverage of Illumina GAII sequence data for each of the selected genotypes. The procedure we developed for primer design is faster, more accurate, less error-prone, and higher-throughput than manual design. All InDel markers are easy to run and score with no need for PCR optimization. A total of 2687 InDel markers distributed across the genome were developed. To highlight their usefulness, they were employed to construct a phylogenetic tree and a genetic map, showing that InDel markers are reliable, simple, and accurate.
Affiliation(s)
- Samira Mafi Moghaddam
- Genomics and Bioinformatics Program, North Dakota State University, Fargo, ND, USA
- Department of Plant Sciences, North Dakota State University, Fargo, ND, USA
| | - Qijian Song
- Soybean Genomics and Improvement Laboratory, United States Department of Agriculture, Agricultural Research Service, Beltsville, MD, USA
| | - Sujan Mamidi
- Genomics and Bioinformatics Program, North Dakota State University, Fargo, ND, USA
- Department of Plant Sciences, North Dakota State University, Fargo, ND, USA
| | | | - Rian Lee
- Department of Plant Sciences, North Dakota State University, Fargo, ND, USA
| | - Perry Cregan
- Soybean Genomics and Improvement Laboratory, United States Department of Agriculture, Agricultural Research Service, Beltsville, MD, USA
| | - Juan M. Osorno
- Department of Plant Sciences, North Dakota State University, Fargo, ND, USA
| | - Phillip E. McClean
- Genomics and Bioinformatics Program, North Dakota State University, Fargo, ND, USA
- Department of Plant Sciences, North Dakota State University, Fargo, ND, USA
|
774
|
Zielinski D, Markus B, Sheikh M, Gymrek M, Chu C, Zaks M, Srinivasan B, Hoffman JD, Aizenbud D, Erlich Y. OTX2 duplication is implicated in hemifacial microsomia. PLoS One 2014; 9:e96788. [PMID: 24816892 PMCID: PMC4016008 DOI: 10.1371/journal.pone.0096788] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 12/05/2013] [Accepted: 04/11/2014] [Indexed: 12/21/2022] Open
Abstract
Hemifacial microsomia (HFM) is the second most common facial anomaly after cleft lip and palate. The phenotype is highly variable and most cases are sporadic. We investigated the disorder in a large pedigree with five affected individuals spanning eight meioses. Whole-exome sequencing results indicated the absence of a pathogenic coding point mutation. A genome-wide survey of segmental variations identified a 1.3 Mb duplication of chromosome 14q22.3 in all affected individuals that was absent in more than 1000 chromosomes of ethnically matched controls. The duplication was absent in seven additional sporadic HFM cases, which is consistent with the known heterogeneity of the disorder. To find the critical gene in the duplicated region, we analyzed signatures of human craniofacial disease networks, mouse expression data, and predictions of dosage sensitivity. All of these approaches implicated OTX2 as the most likely causal gene. Moreover, OTX2 is a known oncogenic driver in medulloblastoma, a condition that was diagnosed in the proband during the course of the study. Our findings suggest a role for OTX2 dosage sensitivity in human craniofacial development and raise the possibility of a shared etiology between a subtype of hemifacial microsomia and medulloblastoma.
Affiliation(s)
- Dina Zielinski
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
| | - Barak Markus
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
| | - Mona Sheikh
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
| | - Melissa Gymrek
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
- Harvard-MIT Division of Health Sciences and Technology, MIT, Cambridge, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Department of Molecular Biology and Diabetes Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
| | - Clement Chu
- Counsyl, South San Francisco, California, United States of America
| | - Marta Zaks
- Rambam Health Care Campus, Haifa, Israel
| | | | - Jodi D. Hoffman
- Division of Genetics, Tufts Medical Center, Boston, Massachusetts, United States of America
| | | | - Yaniv Erlich
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
|
775
|
Mutlu N, Garipler G, Akdoğan E, Dunn CD. Activation of the pleiotropic drug resistance pathway can promote mitochondrial DNA retention by fusion-defective mitochondria in Saccharomyces cerevisiae. G3 (BETHESDA, MD.) 2014; 4:1247-58. [PMID: 24807265 PMCID: PMC4455774 DOI: 10.1534/g3.114.010330] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 01/01/2014] [Accepted: 05/05/2014] [Indexed: 11/18/2022]
Abstract
Genetic and microscopic approaches using Saccharomyces cerevisiae have identified many proteins that play a role in mitochondrial dynamics, but it is possible that other proteins and pathways that play a role in mitochondrial division and fusion remain to be discovered. Mutants lacking mitochondrial fusion are characterized by rapid loss of mitochondrial DNA. We took advantage of a petite-negative mutant that is unable to survive mitochondrial DNA loss to select for mutations that allow cells with fusion-deficient mitochondria to maintain the mitochondrial genome on fermentable medium. Next-generation sequencing revealed that all identified suppressor mutations not associated with known mitochondrial division components were localized to PDR1 or PDR3, which encode transcription factors promoting drug resistance. Further studies revealed that at least one, if not all, of these suppressor mutations dominantly increases resistance to known substrates of the pleiotropic drug resistance pathway. Interestingly, hyperactivation of this pathway did not significantly affect mitochondrial shape, suggesting that mitochondrial division was not greatly affected. Our results reveal an intriguing genetic connection between pleiotropic drug resistance and mitochondrial dynamics.
Affiliation(s)
- Nebibe Mutlu
- Department of Molecular Biology and Genetics, Koç University, Sarıyer, İstanbul, 34450, Turkey
| | - Görkem Garipler
- Department of Molecular Biology and Genetics, Koç University, Sarıyer, İstanbul, 34450, Turkey
| | - Emel Akdoğan
- Department of Molecular Biology and Genetics, Koç University, Sarıyer, İstanbul, 34450, Turkey
| | - Cory D Dunn
- Department of Molecular Biology and Genetics, Koç University, Sarıyer, İstanbul, 34450, Turkey
|
776
|
Pavy N, Deschênes A, Blais S, Lavigne P, Beaulieu J, Isabel N, Mackay J, Bousquet J. The landscape of nucleotide polymorphism among 13,500 genes of the conifer Picea glauca, relationships with functions, and comparison with Medicago truncatula. Genome Biol Evol 2014; 5:1910-25. [PMID: 24065735 PMCID: PMC3814201 DOI: 10.1093/gbe/evt143] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 01/01/2023] Open
Abstract
Gene families differ in composition, expression, and chromosomal organization between conifers and angiosperms, but little is known regarding nucleotide polymorphism. Using various sequencing strategies, an atlas of 212k high-confidence single nucleotide polymorphisms (SNPs) with a validation rate of more than 92% was developed for the conifer white spruce (Picea glauca). Nonsynonymous and synonymous SNPs were annotated over the corresponding 13,498 white spruce genes representative of 2,457 known gene families. Patterns of nucleotide polymorphisms were analyzed by estimating the ratio of nonsynonymous to synonymous numbers of substitutions per site (A/S). A general excess of synonymous SNPs was expected and observed. However, analysis from several perspectives identified groups of genes harboring an excess of nonsynonymous SNPs, and thus potentially under positive selection. Four known gene families harbored such an excess: dehydrins, ankyrin-repeats, AP2/DREB, and leucine-rich repeat. Conifer-specific sequences were also generally associated with the highest A/S ratios. A/S values were also distributed asymmetrically across genes specifically expressed in megagametophytes, roots, or in both, harboring on average an excess of nonsynonymous SNPs. These patterns confirm that the breadth of gene expression is a contributing factor to the evolution of nucleotide polymorphism. The A/S ratios of Medicago truncatula genes were also analyzed: several gene families shared between the P. glauca and M. truncatula data sets had a similar excess of synonymous or nonsynonymous SNPs. However, a number of families with high A/S ratios were found to be specific to P. glauca, suggesting cases of divergent evolution at the functional level.
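As a rough illustration of the A/S statistic mentioned in this abstract, the following Python sketch computes a per-site ratio for one gene. The abstract does not give the exact estimator, so the formula used here (nonsynonymous SNPs per nonsynonymous site divided by synonymous SNPs per synonymous site), the function names, and the input counts are all assumptions.

```python
def per_site_rate(snp_count: int, site_count: float) -> float:
    """Number of SNPs per available site of the given type."""
    if site_count <= 0:
        raise ValueError("site_count must be positive")
    return snp_count / site_count


def a_over_s(nonsyn_snps: int, nonsyn_sites: float,
             syn_snps: int, syn_sites: float):
    """Assumed A/S ratio; returns None when no synonymous SNPs are observed."""
    a = per_site_rate(nonsyn_snps, nonsyn_sites)
    s = per_site_rate(syn_snps, syn_sites)
    if s == 0:
        return None  # ratio is undefined without synonymous variation
    return a / s


# Hypothetical gene: 3 nonsynonymous SNPs over 300 nonsynonymous sites,
# 4 synonymous SNPs over 100 synonymous sites.
ratio = a_over_s(3, 300, 4, 100)
```

Under this assumed definition, a value below 1 corresponds to the general excess of synonymous SNPs the abstract reports, and a value above 1 to an excess of nonsynonymous SNPs.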
Affiliation(s)
- Nathalie Pavy
- Canada Research Chair in Forest and Environmental Genomics, Centre for Forest Research and Institute for Systems and Integrative Biology, Université Laval, Québec, Canada
|
777
|
Chan THM, Lin CH, Qi L, Fei J, Li Y, Yong KJ, Liu M, Song Y, Chow RKK, Ng VHE, Yuan YF, Tenen DG, Guan XY, Chen L. A disrupted RNA editing balance mediated by ADARs (Adenosine DeAminases that act on RNA) in human hepatocellular carcinoma. Gut 2014; 63:832-43. [PMID: 23766440 PMCID: PMC3995272 DOI: 10.1136/gutjnl-2012-304037] [Citation(s) in RCA: 181] [Impact Index Per Article: 16.5] [Indexed: 12/11/2022]
Abstract
OBJECTIVE Hepatocellular carcinoma (HCC) is a heterogeneous tumour displaying a complex variety of genetic and epigenetic changes. In human cancers, aberrant post-transcriptional modifications, such as alternative splicing and RNA editing, may lead to tumour-specific transcriptome diversity. DESIGN By utilising large-scale, deep transcriptome sequencing of three paired HCC clinical specimens and their adjacent non-tumour (NT) tissue counterparts, we discovered an average of 20 007 inferred A to I (adenosine to inosine) RNA editing events in transcripts. The roles of the double-stranded RNA-specific ADAR (Adenosine DeAminase that act on RNA) family members (ADARs) and the altered gene-specific editing patterns were investigated in clinical specimens, cell models and mice. RESULTS HCC displays a severely disrupted A to I RNA editing balance. ADAR1 and ADAR2 manipulate the A to I imbalance of HCC via their differential expression in HCC compared with NT liver tissues. Patients with ADAR1 overexpression and ADAR2 downregulation in tumours demonstrated an increased risk of liver cirrhosis and postoperative recurrence and had poor prognoses. Owing to the differential expression of ADAR1 and ADAR2 in tumours, the altered gene-specific editing activities, which were reflected by the hyper-editing of FLNB (filamin B, β) and the hypo-editing of COPA (coatomer protein complex, subunit α), are closely associated with HCC pathogenesis. In vitro and in vivo functional assays show that ADAR1 functions as an oncogene while ADAR2 has tumour-suppressive ability in HCC. CONCLUSIONS These findings highlight the fact that the differentially expressed ADARs in tumours, which are responsible for an A to I editing imbalance, have great prognostic value and diagnostic potential for HCC.
Affiliation(s)
- Tim Hon Man Chan
- Cancer Science Institute of Singapore, National University of Singapore, Singapore; Department of Clinical Oncology, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China; State Key Laboratory of Liver Research, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China
| | - Chi Ho Lin
- Genome Research Centre, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China
| | - Lihua Qi
- Cancer Science Institute of Singapore, National University of Singapore, Singapore
| | - Jing Fei
- Cancer Science Institute of Singapore, National University of Singapore, Singapore
| | - Yan Li
- Department of Clinical Oncology, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China; State Key Laboratory of Liver Research, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China; State Key Laboratory of Oncology in Southern China, Cancer Centre, Sun Yat-sen University Cancer Centre, Guangzhou, China
| | - Kol Jia Yong
- Cancer Science Institute of Singapore, National University of Singapore, Singapore
| | - Ming Liu
- Department of Clinical Oncology, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China; State Key Laboratory of Liver Research, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China
| | - Yangyang Song
- Department of Clinical Oncology, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China; State Key Laboratory of Liver Research, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China
| | - Raymond Kwok Kei Chow
- Department of Clinical Oncology, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China
| | - Vanessa Hui En Ng
- Cancer Science Institute of Singapore, National University of Singapore, Singapore
| | - Yun-Fei Yuan
- Department of Hepatobiliary Oncology, Sun Yat-sen University Cancer Centre, Guangzhou, China
| | - Daniel G Tenen
- Cancer Science Institute of Singapore, National University of Singapore, Singapore; Harvard Stem Cell Institute, Harvard Medical School, Boston, Massachusetts, USA
| | - Xin-Yuan Guan
- Department of Clinical Oncology, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China; State Key Laboratory of Liver Research, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong, China; State Key Laboratory of Oncology in Southern China, Cancer Centre, Sun Yat-sen University Cancer Centre, Guangzhou, China
| | - Leilei Chen
- Cancer Science Institute of Singapore, National University of Singapore, Singapore
|
778
|
Mapping small effect mutations in Saccharomyces cerevisiae: impacts of experimental design and mutational properties. G3-GENES GENOMES GENETICS 2014; 4:1205-16. [PMID: 24789747 PMCID: PMC4455770 DOI: 10.1534/g3.114.011783] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 11/18/2022]
Abstract
Genetic variants identified by mapping are biased toward large phenotypic effects because of methodologic challenges for detecting genetic variants with small phenotypic effects. Recently, bulk segregant analysis combined with next-generation sequencing (BSA-seq) was shown to be a powerful and cost-effective way to map small effect variants in natural populations. Here, we examine the power of BSA-seq for efficiently mapping small effect mutations isolated from a mutagenesis screen. Specifically, we determined the impact of segregant population size, intensity of phenotypic selection to collect segregants, number of mitotic generations between meiosis and sequencing, and average sequencing depth on power for mapping mutations with a range of effects on the phenotypic mean and standard deviation as well as relative fitness. We then used BSA-seq to map the mutations responsible for three ethyl methanesulfonate−induced mutant phenotypes in Saccharomyces cerevisiae. These mutants display small quantitative variation in the mean expression of a fluorescent reporter gene (−3%, +7%, and +10%). Using a genetic background with increased meiosis rate, a reliable mating type marker, and fluorescence-activated cell sorting to efficiently score large segregating populations and isolate cells with extreme phenotypes, we successfully mapped and functionally confirmed a single point mutation responsible for the mutant phenotype in all three cases. Our simulations and experimental data show that the effects of a causative site not only on the mean phenotype, but also on its standard deviation and relative fitness should be considered when mapping genetic variants in microorganisms such as yeast that require population growth steps for BSA-seq.
|
779
|
Zuzarte PC, Denroche RE, Fehringer G, Katzov-Eckert H, Hung RJ, McPherson JD. A two-dimensional pooling strategy for rare variant detection on next-generation sequencing platforms. PLoS One 2014; 9:e93455. [PMID: 24728235 PMCID: PMC3984111 DOI: 10.1371/journal.pone.0093455] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 12/18/2013] [Accepted: 03/04/2014] [Indexed: 11/18/2022] Open
Abstract
We describe a method for pooling and sequencing DNA from a large number of individual samples while preserving information regarding sample identity. DNA from 576 individuals was arranged into four 12 row by 12 column matrices and then pooled by row and by column resulting in 96 total pools with 12 individuals in each pool. Pooling of DNA was carried out in a two-dimensional fashion, such that DNA from each individual is present in exactly one row pool and exactly one column pool. By considering the variants observed in the rows and columns of a matrix we are able to trace rare variants back to the specific individuals that carry them. The pooled DNA samples were enriched over a 250 kb region previously identified by GWAS to significantly predispose individuals to lung cancer. All 96 pools (12 row and 12 column pools from 4 matrices) were barcoded and sequenced on an Illumina HiSeq 2000 instrument with an average depth of coverage greater than 4,000×. Verification based on Ion PGM sequencing confirmed the presence of 91.4% of confidently classified SNVs assayed. In this way, each individual sample is sequenced in multiple pools providing more accurate variant calling than a single pool or a multiplexed approach. This provides a powerful method for rare variant detection in regions of interest at a reduced cost to the researcher.
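The row-and-column tracing described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the sample identifiers, function names, and the idea of passing in the indices of variant-positive pools are hypothetical, and real data would also need thresholds for calling a pool positive.

```python
from itertools import product


def build_pools(matrix):
    """Return (row_pools, col_pools) for one square matrix of sample IDs.

    Each sample lands in exactly one row pool and one column pool.
    """
    row_pools = [set(row) for row in matrix]
    col_pools = [set(col) for col in zip(*matrix)]
    return row_pools, col_pools


def candidate_carriers(matrix, positive_rows, positive_cols):
    """Samples at the intersections of variant-positive row and column pools."""
    return [matrix[r][c]
            for r, c in product(sorted(positive_rows), sorted(positive_cols))]


# One 12 x 12 matrix of hypothetical sample IDs; the study used four of them.
matrix = [[f"S{r:02d}_{c:02d}" for c in range(12)] for r in range(12)]
row_pools, col_pools = build_pools(matrix)

# A variant seen only in row pool 3 and column pool 7 points to one sample.
single = candidate_carriers(matrix, {3}, {7})

# Two positive rows and two positive columns leave four candidates,
# so carriers in different rows and columns cannot be fully resolved.
ambiguous = candidate_carriers(matrix, {1, 4}, {2, 9})
```

Each matrix yields 12 row pools plus 12 column pools, which for four matrices gives the 96 pools stated in the abstract. The second example shows why the approach suits rare variants: the more carriers a matrix holds, the more intersections remain as candidates.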
Affiliation(s)
- Philip C. Zuzarte
- Genome Technologies, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Robert E. Denroche
- Genome Technologies, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Gordon Fehringer
- Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, Ontario, Canada
| | - Hagit Katzov-Eckert
- Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, Ontario, Canada
| | - Rayjean J. Hung
- Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, Ontario, Canada
- Division of Epidemiology, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
| | - John D. McPherson
- Genome Technologies, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
|
780
|
Torri F, Dinov ID, Zamanyan A, Hobel S, Genco A, Petrosyan P, Clark AP, Liu Z, Eggert P, Pierce J, Knowles JA, Ames J, Kesselman C, Toga AW, Potkin SG, Vawter MP, Macciardi F. Next generation sequence analysis and computational genomics using graphical pipeline workflows. Genes (Basel) 2014; 3:545-75. [PMID: 23139896 PMCID: PMC3490498 DOI: 10.3390/genes3030545] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Indexed: 12/30/2022] Open
Abstract
Whole-genome and exome sequencing have already proven to be essential and powerful methods to identify genes responsible for simple Mendelian inherited disorders. These methods can be applied to complex disorders as well, and have been adopted as one of the current mainstream approaches in population genetics. These achievements have been made possible by next generation sequencing (NGS) technologies, which require substantial bioinformatics resources to analyze the dense and complex sequence data. The huge analytical burden of data from genome sequencing might be seen as a bottleneck slowing the publication of NGS papers at this time, especially in psychiatric genetics. We review the existing methods for processing NGS data, to place into context the rationale for the design of a computational resource. We describe our method, the Graphical Pipeline for Computational Genomics (GPCG), to perform the computational steps required to analyze NGS data. The GPCG implements flexible workflows for basic sequence alignment, sequence data quality control, single nucleotide polymorphism analysis, copy number variant identification, annotation, and visualization of results. These workflows cover all the analytical steps required for NGS data, from processing the raw reads to variant calling and annotation. The current version of the pipeline is freely available at http://pipeline.loni.ucla.edu. These applications of NGS analysis may gain clinical utility in the near future (e.g., identifying miRNA signatures in diseases) when the bioinformatics approach is made feasible. Taken together, the annotation tools and strategies that have been developed to retrieve information and test hypotheses about the functional role of variants present in the human genome will help to pinpoint the genetic risk factors for psychiatric disorders.
Affiliation(s)
- Federica Torri
- Department of Psychiatry and Human Behavior, University of California, Irvine, CA 92617, USA; E-Mails: (F.T.); (S.G.P.)
- Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA; E-Mails: (I.D.D.); (J.A.); (C.K.); (A.W.T.)
| | - Ivo D. Dinov
- Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA; E-Mails: (I.D.D.); (J.A.); (C.K.); (A.W.T.)
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
| | - Alen Zamanyan
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
| | - Sam Hobel
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
| | - Alex Genco
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
- Petros Petrosyan
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
- Andrew P. Clark
- Zilkha Neurogenetic Institute, USC Keck School of Medicine, Los Angeles, CA 90033, USA; E-Mails: (A.P.C.); (J.A.K.)
- Zhizhong Liu
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
- Paul Eggert
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
- Department of Computer Science, University of California, Los Angeles, CA 90095, USA
- Jonathan Pierce
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
- James A. Knowles
- Zilkha Neurogenetic Institute, USC Keck School of Medicine, Los Angeles, CA 90033, USA; E-Mails: (A.P.C.); (J.A.K.)
- Joseph Ames
- Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA; E-Mails: (I.D.D.); (J.A.); (C.K.); (A.W.T.)
- Carl Kesselman
- Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA; E-Mails: (I.D.D.); (J.A.); (C.K.); (A.W.T.)
- Arthur W. Toga
- Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA; E-Mails: (I.D.D.); (J.A.); (C.K.); (A.W.T.)
- Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA; E-Mails: (A.Z.); (S.H.); (A.G.); (P.P.); (Z.L.); (P.E.); (J.P.)
- Steven G. Potkin
- Department of Psychiatry and Human Behavior, University of California, Irvine, CA 92617, USA; E-Mails: (F.T.); (S.G.P.)
- Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA; E-Mails: (I.D.D.); (J.A.); (C.K.); (A.W.T.)
- Marquis P. Vawter
- Functional Genomics Laboratory, Department of Psychiatry And Human Behavior, School of Medicine, University of California, Irvine, CA 92697, USA; E-Mail:
- Fabio Macciardi
- Department of Psychiatry and Human Behavior, University of California, Irvine, CA 92617, USA; E-Mails: (F.T.); (S.G.P.)
- Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA; E-Mails: (I.D.D.); (J.A.); (C.K.); (A.W.T.)
- Author to whom correspondence should be addressed; E-Mail: ; Tel.: +1-949-824-4559; Fax: +1-949-824-2072
|
781
|
Drury S, Boustred C, Tekman M, Stanescu H, Kleta R, Lench N, Chitty LS, Scott RH. A novel homozygous ERCC5 truncating mutation in a family with prenatal arthrogryposis--further evidence of genotype-phenotype correlation. Am J Med Genet A 2014; 164A:1777-83. [PMID: 24700531 DOI: 10.1002/ajmg.a.36506] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 11/08/2013] [Accepted: 01/30/2014] [Indexed: 11/10/2022]
Abstract
We report on a family with five fetuses conceived to first cousin parents presenting with abnormal ultrasound findings including contractures and microcephaly. Cerebellar hypoplasia and ventriculomegaly were also present in two and fetal edema developed in the one fetus that survived beyond 24 weeks of gestation. Linkage studies of 15 members of the family, including four affecteds, were undertaken followed by exome sequencing of one affected individual and their parents. Analysis of exome data was restricted to the 9.3 Mb largest shared region of homozygosity identified by linkage; a single novel homozygous mutation in the proband that was heterozygous in the parents (ERCC5 c.2766dupA, p.Leu923ThrfsX7) was identified. This segregated with disease. ERCC5 is a component of the nucleotide excision repair machinery and biallelic mutations in the gene have previously been associated with xeroderma pigmentosum (group G), Cockayne syndrome and the more severe cerebrooculofacioskeletal syndrome. The phenotype in the family we report on is consistent with a severe manifestation of cerebrooculofacioskeletal syndrome. Our data broaden the reported clinical spectrum of ERCC5 mutations and provide further evidence of genotype-phenotype correlation with truncating mutations being associated with severe phenotypes. They also demonstrate the molecular diagnostic power of a combined approach of linkage studies and exome sequencing in families with rare, genetically heterogeneous disorders and a well described pedigree.
Affiliation(s)
- Suzanne Drury
- NE Thames Regional Genetics Service, Great Ormond Street Hospital for Children, London, United Kingdom
|
782
|
Vergara IA, Tarailo-Graovac M, Frech C, Wang J, Qin Z, Zhang T, She R, Chu JSC, Wang K, Chen N. Genome-wide variations in a natural isolate of the nematode Caenorhabditis elegans. BMC Genomics 2014; 15:255. [PMID: 24694239 PMCID: PMC4023591 DOI: 10.1186/1471-2164-15-255] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 06/29/2013] [Accepted: 03/03/2014] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND Increasing genetic and phenotypic differences found among natural isolates of C. elegans have encouraged researchers to explore the natural variation of this nematode species. RESULTS Here we report on the identification of genomic differences between the reference strain N2 and the Hawaiian strain CB4856, one of the most genetically distant strains from N2. To identify both small- and large-scale genomic variations (GVs), we have sequenced the CB4856 genome using both Roche 454 (~400 bps single reads) and Illumina GA DNA sequencing methods (101 bps paired-end reads). Compared to previously described variants (available in WormBase), our effort uncovered twice as many single nucleotide variants (SNVs) and increased the number of small InDels almost 20-fold. Moreover, we identified and validated large insertions, most of which range from 150 bps to 1.2 kb in length in the CB4856 strain. Identified GVs had a widespread impact on protein-coding sequences, including 585 single-copy genes that have associated severe phenotypes of reduced viability in RNAi and genetics studies. Sixty of these genes are homologs of human genes associated with diseases. Furthermore, our work confirms previously identified GVs associated with differences in behavioural and biological traits between the N2 and CB4856 strains. CONCLUSIONS The identified GVs provide a rich resource for future studies that aim to explain the genetic basis for other trait differences between the N2 and CB4856 strains.
Affiliation(s)
- Nansheng Chen
- Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia V5A 1S6, Canada.
|
783
|
Characterizing the molecular basis of attenuation of Marek's disease virus via in vitro serial passage identifies de novo mutations in the helicase-primase subunit gene UL5 and other candidates associated with reduced virulence. J Virol 2014; 88:6232-42. [PMID: 24648463 DOI: 10.1128/jvi.03869-13] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 11/20/2022] Open
Abstract
Marek's disease (MD) is a lymphoproliferative disease of chickens caused by the oncogenic Gallid herpesvirus 2, commonly known as Marek's disease virus (MDV). MD vaccines, the primary control method, are often generated by repeated in vitro serial passage of this highly cell-associated virus to attenuate virulent MDV strains. To understand the genetic basis of attenuation, we used experimental evolution by serially passing three virulent MDV replicates generated from an infectious bacterial artificial chromosome (BAC) clone. All replicates became completely or highly attenuated, indicating that de novo mutation, and not selection among quasispecies existing in a strain, is the primary driving force for the reduction in virulence. Sequence analysis of the attenuated replicates revealed 41 to 95 single-nucleotide variants (SNVs) at 2% or higher frequency in each population and several candidate genes containing high-frequency, nonsynonymous mutations. Five candidate mutations were incorporated into recombinant viruses to determine their in vivo effect. SNVs within UL42 (DNA polymerase auxiliary subunit) and UL46 (tegument) had no measurable influence, while two independent mutations in LORF2 (a gene of unknown function) improved survival time of birds but did not alter disease incidence. A fifth SNV located within UL5 (helicase-primase subunit) greatly reduced in vivo viral replication, increased survival time of birds, and resulted in only 0 to 11% disease incidence. This study shows that multiple genes, often within pathways involving DNA replication and transcriptional regulation, are involved in de novo attenuation of MDV and provides targets for the rational design of future MD vaccines. IMPORTANCE Marek's disease virus (MDV) is a very important pathogen in chickens that costs the worldwide poultry industry $1 billion to $2 billion annually. Marek's disease (MD) vaccines, the primary control method, are often produced by passing virulent strains in cell culture until attenuated. To understand this process, we identified all the changes in the viral genome that occurred during repeated cell passage. We find that a single mutation in the UL5 gene, which encodes a viral protein necessary for DNA replication, reduces disease incidence by 90% or more. In addition, other candidate genes were identified. This information should lead to the development of more effective and rationally designed MD vaccines leading to improved animal health and welfare and lower costs to consumers.
|
784
|
Pinosio S, González-Martínez SC, Bagnoli F, Cattonaro F, Grivet D, Marroni F, Lorenzo Z, Pausas JG, Verdú M, Vendramin GG. First insights into the transcriptome and development of new genomic tools of a widespread circum-Mediterranean tree species, Pinus halepensis Mill. Mol Ecol Resour 2014; 14:846-56. [PMID: 24450970 DOI: 10.1111/1755-0998.12232] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Received: 11/12/2013] [Revised: 01/15/2014] [Accepted: 01/17/2014] [Indexed: 11/30/2022]
Abstract
Aleppo pine (Pinus halepensis Mill.) is a relevant conifer species for studying adaptive responses to drought and fire regimes in the Mediterranean region. In this study, we performed Illumina next-generation sequencing of two phenotypically divergent Aleppo pine accessions with the aims of (i) characterizing the transcriptome through Illumina RNA-Seq on trees phenotypically divergent for adaptive traits linked to fire adaptation and drought, (ii) performing a functional annotation of the assembled transcriptome, (iii) identifying genes with accelerated evolutionary rates, (iv) studying the expression levels of the annotated genes and (v) developing gene-based markers for population genomic and association genetic studies. The assembled transcriptome consisted of 48,629 contigs and covered about 54.6 Mbp. The comparison of Aleppo pine transcripts to Picea sitchensis protein-coding sequences resulted in the detection of 34,014 SNPs across species, with an average Ka/Ks value of 0.216, suggesting that the majority of the assembled genes are under negative selection. Several genes were differentially expressed across the two pine accessions with contrasting phenotypes, including a glutathione-s-transferase, a cellulose synthase and a cobra-like protein. A large number of new markers (3334 amplifiable SSRs and 28,236 SNPs) have been identified, which should facilitate future population genomics and association genetics in this species. A 384-SNP Oligo Pool Assay for genotyping with the Illumina VeraCode technology has been designed, which showed a high overall SNP conversion rate (76.6%). Our results showed that Illumina next-generation sequencing is a valuable technology for obtaining an extensive overview of whole transcriptomes of nonmodel species with large genomes.
Affiliation(s)
- S Pinosio
- Institute of Biosciences and Bioresources, National Research Council, Via Madonna del Piano 10, 50019, Sesto Fiorentino, Firenze, Italy; IGA Technology Services s.r.l., Via J. Linussio, 51, 33100, Udine, Italy
|
785
|
Abstract
BACKGROUND Facial infiltrating lipomatosis is a nonheritable disorder characterized by hemifacial soft-tissue and skeletal overgrowth, precocious dental development, macrodontia, hemimacroglossia, and mucosal neuromas. The authors tested the hypothesis that this condition is caused by a somatic mutation in the phosphatidylinositide-3 kinase (PI3K) signaling pathway, which has been indicted in other anomalies with overgrowth. METHODS The authors extracted DNA from abnormal tissue in six individuals, generated sequencing libraries, enriched the libraries for 26 genes involved in the PI3K pathway, and designed and applied a sequential filtering strategy to analyze the sequence data for mosaic mutations. RESULTS Unfiltered sequence data contained variant reads affecting ~12 percent of basepairs in the targeted genes. Filtering reduced the fraction of targeted basepairs containing variant reads to ~0.008 percent, allowing the authors to identify causal missense mutations in PIK3CA (p.E453K, p.E542K, p.H1047R, or p.H1047L) in each affected tissue sample. CONCLUSIONS Affected tissue from individuals with facial infiltrating lipomatosis contains PIK3CA mutations that have previously been reported in cancers and in affected tissue from other nonheritable, overgrowth disorders, including congenital lipomatous overgrowth, vascular, epidermal, and skeletal anomalies syndrome, Klippel-Trenaunay syndrome, hemimegalencephaly, fibroadipose overgrowth, and macrodactyly. Because PIK3CA encodes a catalytic subunit of PI3K, and in vitro studies have shown that the overgrowth-associated mutations increase this enzyme's activity, PI3K inhibitors currently in clinical trials for patients with cancer may have a therapeutic role in patients with facial infiltrating lipomatosis. The strategy used to identify somatic mutations in patients with facial infiltrating lipomatosis is applicable to other somatic mosaic disorders that have allelic heterogeneity.
|
786
|
Fine mapping of genome activation in bovine embryos by RNA sequencing. Proc Natl Acad Sci U S A 2014; 111:4139-44. [PMID: 24591639 DOI: 10.1073/pnas.1321569111] [Citation(s) in RCA: 243] [Impact Index Per Article: 22.1] [Indexed: 12/13/2022] Open
Abstract
During the maternal-to-embryonic transition, control of embryonic development gradually switches from maternal RNAs and proteins stored in the oocyte to gene products generated after embryonic genome activation (EGA). Detailed insight into the onset of embryonic transcription is obscured by the presence of maternal transcripts. Using the bovine model system, we established by RNA sequencing a comprehensive catalogue of transcripts in germinal vesicle and metaphase II oocytes, and in embryos at the four-cell, eight-cell, 16-cell, and blastocyst stages. These were produced by in vitro fertilization of Bos taurus taurus oocytes with sperm from a Bos taurus indicus bull to facilitate parent-specific transcriptome analysis. Transcripts from 12.4 to 13.7 × 10^3 different genes were detected in the various developmental stages. EGA was analyzed by (i) detection of embryonic transcripts, which are not present in oocytes; (ii) detection of transcripts from the paternal allele; and (iii) detection of primary transcripts with intronic sequences. These strategies revealed (i) 220, (ii) 937, and (iii) 6,848 genes to be activated from the four-cell to the blastocyst stage. The largest proportion of gene activation [i.e., (i) 59%, (ii) 42%, and (iii) 58%] was found in eight-cell embryos, indicating major EGA at this stage. Gene ontology analysis of genes activated at the four-cell stage identified categories related to RNA processing, translation, and transport, consistent with preparation for major EGA. Our study provides the largest transcriptome data set of bovine oocyte maturation and early embryonic development and detailed insight into the timing of embryonic activation of specific genes.
|
787
|
Cheng AY, Teo YY, Ong RTH. Assessing single nucleotide variant detection and genotype calling on whole-genome sequenced individuals. Bioinformatics 2014; 30:1707-13. [PMID: 24558117 DOI: 10.1093/bioinformatics/btu067] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Indexed: 12/22/2022]
Abstract
MOTIVATION Whole-genome sequencing (WGS) is now routinely used for the detection and identification of genetic variants, particularly single nucleotide polymorphisms (SNPs) in humans, and this has provided valuable new insights into human diversity, population histories and genetic association studies of traits and diseases. However, this relies on accurate detection and genotype calling of the polymorphisms present in the samples sequenced. To minimize cost, the majority of current WGS studies, including the 1000 Genomes Project (1 KGP), have adopted low-coverage sequencing of a large number of samples, and such designs have inadvertently influenced the development of variant calling methods for WGS data. Assessments of variant accuracy are usually performed on the same set of low-coverage individuals or on a smaller number of deeply sequenced individuals. It is thus unclear how these variant calling methods would fare for a dataset of ∼100 samples from a population not part of the 1 KGP that have been sequenced at various coverage depths. RESULTS Using down-sampling of the sequencing reads obtained from the Singapore Sequencing Malay Project (SSMP), and a set of SNP calls from the same individuals genotyped on the Illumina Omni1-Quad array, we assessed the sensitivity of SNP detection, the accuracy of genotype calls and the variant accuracy for six commonly used variant calling methods: GATK, SAMtools, Consensus Assessment of Sequence and Variation (CASAVA), VarScan, glfTools and SOAPsnp. The results indicate that at 5× coverage depth, the multi-sample callers of GATK and SAMtools yield the best accuracy, particularly if the study samples are called together with a large number of individuals such as those from the 1000 Genomes Project. If study samples are sequenced at a high coverage depth such as 30×, CASAVA has the highest variant accuracy compared with the other variant callers assessed.
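The comparison against array genotypes described in this abstract can be illustrated with a small sketch. It is not the authors' code: the function name `evaluate`, the genotype encoding (`"A/G"` strings) and the exact metric definitions are assumptions chosen for illustration, and the study's own definitions may differ. Sensitivity is taken here as the fraction of array-confirmed variant sites that the caller also reports as variant, and concordance as the fraction of those detected sites whose genotype matches the array.

```python
from typing import Dict, Tuple

Site = Tuple[str, int]   # (chromosome, position); hypothetical key format
Genotype = str           # unphased diploid genotype such as "A/G"


def is_variant(gt: Genotype, ref: str) -> bool:
    """Return True if the genotype carries at least one non-reference allele."""
    return any(allele != ref for allele in gt.split("/"))


def normalise(gt: Genotype) -> Genotype:
    """Sort alleles so that "G/A" and "A/G" compare equal."""
    return "/".join(sorted(gt.split("/")))


def evaluate(
    array_gt: Dict[Site, Genotype],
    seq_gt: Dict[Site, Genotype],
    ref_alleles: Dict[Site, str],
) -> Tuple[float, float]:
    """Return (sensitivity, genotype concordance) of sequencing calls,
    treating the array genotypes as the truth set (an assumption)."""
    # Sites where the array shows a non-reference allele.
    truth = [s for s, gt in array_gt.items() if is_variant(gt, ref_alleles[s])]
    # Truth sites that the variant caller also reports as variant.
    detected = [
        s for s in truth
        if s in seq_gt and is_variant(seq_gt[s], ref_alleles[s])
    ]
    # Detected sites whose genotype agrees with the array.
    concordant = [
        s for s in detected if normalise(seq_gt[s]) == normalise(array_gt[s])
    ]
    sensitivity = len(detected) / len(truth) if truth else float("nan")
    concordance = len(concordant) / len(detected) if detected else float("nan")
    return sensitivity, concordance
```

Running the same function on calls made from reads down-sampled to different depths (for example 5× and 30×) would give one sensitivity and concordance pair per depth and per caller, which is the kind of comparison the abstract reports.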
Affiliation(s)
- Anthony Youzhi Cheng
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore 117597, Life Sciences Institute, National University of Singapore, Singapore 117456, Department of Statistics and Applied Probability, National University of Singapore, Singapore 117546, NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore 117456 and Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore 138672
- Yik-Ying Teo
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore 117597, Life Sciences Institute, National University of Singapore, Singapore 117456, Department of Statistics and Applied Probability, National University of Singapore, Singapore 117546, NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore 117456 and Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore 138672
- Rick Twee-Hee Ong
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore 117597, Life Sciences Institute, National University of Singapore, Singapore 117456, Department of Statistics and Applied Probability, National University of Singapore, Singapore 117546, NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore 117456 and Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore 138672
|
788
|
Anvar SY, van der Gaag KJ, van der Heijden JWF, Veltrop MHAM, Vossen RHAM, de Leeuw RH, Breukel C, Buermans HPJ, Verbeek JS, de Knijff P, den Dunnen JT, Laros JFJ. TSSV: a tool for characterization of complex allelic variants in pure and mixed genomes. Bioinformatics 2014; 30:1651-9. [DOI: 10.1093/bioinformatics/btu068] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 12/30/2022] Open
|
789
|
Shearman JR, Sangsrakru D, Ruang-areerate P, Sonthirod C, Uthaipaisanwong P, Yoocha T, Poopear S, Theerawattanasuk K, Tragoonrung S, Tangphatsornruang S. Assembly and analysis of a male sterile rubber tree mitochondrial genome reveals DNA rearrangement events and a novel transcript. BMC Plant Biology 2014; 14:45. [PMID: 24512148 PMCID: PMC3925788 DOI: 10.1186/1471-2229-14-45] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 12/20/2013] [Accepted: 02/07/2014] [Indexed: 05/29/2023]
Abstract
BACKGROUND The rubber tree, Hevea brasiliensis, is an important plant species that is commercially grown to produce latex rubber in many countries. The rubber tree variety BPM 24 exhibits cytoplasmic male sterility, inherited from the variety GT 1. RESULTS We constructed the rubber tree mitochondrial genome of a cytoplasmic male sterile variety, BPM 24, using 454 sequencing, including 8 kb paired-end libraries, plus Illumina paired-end sequencing. We annotated this mitochondrial genome with the aid of Illumina RNA-seq data and performed comparative analysis. We then compared the sequence of BPM 24 to the contigs of the published rubber tree, variety RRIM 600, and identified a rearrangement that is unique to BPM 24 resulting in a novel transcript containing a portion of atp9. CONCLUSIONS The novel transcript is consistent with changes that cause cytoplasmic male sterility through a slight reduction to ATP production efficiency. The exhaustive nature of the search rules out alternative causes and supports previous findings of novel transcripts causing cytoplasmic male sterility.
Affiliation(s)
- Jeremy R Shearman
- National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Paholyothin Road, Khlong Nueng, Khlong Luang, Pathumthani 12120, Thailand
- Duangjai Sangsrakru
- National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Paholyothin Road, Khlong Nueng, Khlong Luang, Pathumthani 12120, Thailand
- Panthita Ruang-areerate
- National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Paholyothin Road, Khlong Nueng, Khlong Luang, Pathumthani 12120, Thailand
- Chutima Sonthirod
- National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Paholyothin Road, Khlong Nueng, Khlong Luang, Pathumthani 12120, Thailand
- Pichahpuk Uthaipaisanwong
- National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Paholyothin Road, Khlong Nueng, Khlong Luang, Pathumthani 12120, Thailand
- Thippawan Yoocha
- National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Paholyothin Road, Khlong Nueng, Khlong Luang, Pathumthani 12120, Thailand
- Supannee Poopear
- National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Paholyothin Road, Khlong Nueng, Khlong Luang, Pathumthani 12120, Thailand
- Kanikar Theerawattanasuk
- Rubber Research Institute of Thailand (RRIT), Department of Agriculture, Ministry of Agriculture and Cooperatives, 50 Phaholyothin Road, Chatuchack, Bangkok 10900, Thailand
- Somvong Tragoonrung
- National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Paholyothin Road, Khlong Nueng, Khlong Luang, Pathumthani 12120, Thailand
- Sithichoke Tangphatsornruang
- National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Paholyothin Road, Khlong Nueng, Khlong Luang, Pathumthani 12120, Thailand
|
790
|
Okada Y, Diogo D, Greenberg JD, Mouassess F, Achkar WAL, Fulton RS, Denny JC, Gupta N, Mirel D, Gabriel S, Li G, Kremer JM, Pappas DA, Carroll RJ, Eyler AE, Trynka G, Stahl EA, Cui J, Saxena R, Coenen MJH, Guchelaar HJ, Huizinga TWJ, Dieudé P, Mariette X, Barton A, Canhão H, Fonseca JE, de Vries N, Tak PP, Moreland LW, Bridges SL, Miceli-Richard C, Choi HK, Kamatani Y, Galan P, Lathrop M, Raj T, De Jager PL, Raychaudhuri S, Worthington J, Padyukov L, Klareskog L, Siminovitch KA, Gregersen PK, Mardis ER, Arayssi T, Kazkaz LA, Plenge RM. Integration of sequence data from a Consanguineous family with genetic data from an outbred population identifies PLB1 as a candidate rheumatoid arthritis risk gene. PLoS One 2014; 9:e87645. [PMID: 24520335 PMCID: PMC3919745 DOI: 10.1371/journal.pone.0087645] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 09/23/2013] [Accepted: 12/19/2013] [Indexed: 12/30/2022] Open
Abstract
Integrating genetic data from families with highly penetrant forms of disease together with genetic data from outbred populations represents a promising strategy to uncover the complete frequency spectrum of risk alleles for complex traits such as rheumatoid arthritis (RA). Here, we demonstrate that rare, low-frequency and common alleles at one gene locus, phospholipase B1 (PLB1), might contribute to risk of RA in a 4-generation consanguineous pedigree (Middle Eastern ancestry) and also in unrelated individuals from the general population (European ancestry). Through identity-by-descent (IBD) mapping and whole-exome sequencing, we identified a non-synonymous c.2263G>C (p.G755R) mutation at the PLB1 gene on 2q23, which significantly co-segregated with RA in family members with a dominant mode of inheritance (P = 0.009). We further evaluated PLB1 variants and risk of RA using a GWAS meta-analysis of 8,875 RA cases and 29,367 controls of European ancestry. We identified significant contributions of two independent non-coding variants near PLB1 with risk of RA (rs116018341 [MAF = 0.042] and rs116541814 [MAF = 0.021], combined P = 3.2×10^−6). Finally, we performed deep exon sequencing of PLB1 in 1,088 RA cases and 1,088 controls (European ancestry), and identified suggestive dispersion of rare protein-coding variant frequencies between cases and controls (P = 0.049 for C-alpha test and P = 0.055 for SKAT). Together, these data suggest that PLB1 is a candidate risk gene for RA. Future studies to characterize the full spectrum of genetic risk in the PLB1 genetic locus are warranted.
Affiliation(s)
- Yukinori Okada
- Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
- Department of Human Genetics and Disease Diversity, Tokyo Medical and Dental University Graduate School of Medical and Dental Sciences, Tokyo, Japan
- Laboratory for Statistical Analysis, Center for Integrative Medical Sciences, RIKEN, Yokohama, Japan
- Dorothee Diogo
- Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
- Jeffrey D. Greenberg
- New York University Hospital for Joint Diseases, New York, New York, United States of America
- Faten Mouassess
- Molecular Biology and Biotechnology Department, Human Genetics Division, Damascus, Syria
- Walid A. L. Achkar
- Molecular Biology and Biotechnology Department, Human Genetics Division, Damascus, Syria
- Robert S. Fulton
- The Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Joshua C. Denny
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Namrata Gupta
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
- Daniel Mirel
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
- Stacy Gabriel
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
- Gang Li
- Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Joel M. Kremer
- Department of Medicine, Albany Medical Center and The Center for Rheumatology, Albany, New York, United States of America
| | - Dimitrios A. Pappas
- Division of Rheumatology, Department of Medicine, New York, Presbyterian Hospital, College of Physicians and Surgeons, Columbia University, New York, New York, United States of America
| | - Robert J. Carroll
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
| | - Anne E. Eyler
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
| | - Gosia Trynka
- Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
| | - Eli A. Stahl
- The Department of Psychiatry at Mount Sinai School of Medicine, New York, New York, United States of America
| | - Jing Cui
- Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Richa Saxena
- Center for Human Genetics Research, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Marieke J. H. Coenen
- Department of Human Genetics, Radboud University Medical Centre, Nijmegen, The Netherlands
| | - Henk-Jan Guchelaar
- Department of Clinical Pharmacy and Toxicology, Leiden University Medical Center, Leiden, The Netherlands
| | - Tom W. J. Huizinga
- Department of Rheumatology, Leiden University Medical Centre, Leiden, The Netherlands
| | - Philippe Dieudé
- Service de Rhumatologie et INSERM U699 Hôpital Bichat Claude Bernard, Assistance Publique des Hôpitaux de Paris, Paris, France
- Université Paris 7-Diderot, Paris, France
| | - Xavier Mariette
- Institut National de la Santé et de la Recherche Médicale (INSERM) U1012, Université Paris-Sud, Rhumatologie, Hôpitaux Universitaires Paris-Sud, Assistance Publique-Hôpitaux de Paris (AP-HP), Le Kremlin Bicêtre, France
| | - Anne Barton
- Arthritis Research UK Epidemiology Unit, Centre for Musculoskeletal Research, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
| | - Helena Canhão
- Rheumatology Research Unit, Instituto de Medicina Molecular, Faculdade de Medicina da Universidade de Lisboa, Lisbon, Portugal
- Rheumatology Department, Santa Maria Hospital–CHLN, Lisbon, Portugal
| | - João E. Fonseca
- Rheumatology Research Unit, Instituto de Medicina Molecular, Faculdade de Medicina da Universidade de Lisboa, Lisbon, Portugal
- Rheumatology Department, Santa Maria Hospital–CHLN, Lisbon, Portugal
| | - Niek de Vries
- Department of Clinical Immunology and Rheumatology & Department of Genome Analysis, Academic Medical Center/University of Amsterdam, Amsterdam, The Netherlands
| | - Paul P. Tak
- Department of Clinical Immunology and Rheumatology, Academic Medical Center/University of Amsterdam, Amsterdam, The Netherlands
- GlaxoSmithKline, Stevenage, United Kingdom
| | - Larry W. Moreland
- Division of Rheumatology and Clinical Immunology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - S. Louis Bridges
- Division of Clinical Immunology and Rheumatology, Department of Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States of America
| | - Corinne Miceli-Richard
- Institut National de la Santé et de la Recherche Médicale (INSERM) U1012, Université Paris-Sud, Rhumatologie, Hôpitaux Universitaires Paris-Sud, Assistance Publique-Hôpitaux de Paris (AP-HP), Le Kremlin Bicêtre, France
| | - Hyon K. Choi
- Channing Laboratory, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Section of Rheumatology, Boston University School of Medicine, Boston, Massachusetts, United States of America
- Clinical Epidemiology Research and Training Unit, Boston University School of Medicine, Boston, Massachusetts, United States of America
| | - Yoichiro Kamatani
- Laboratory for Statistical Analysis, Center for Integrative Medical Sciences, RIKEN, Yokohama, Japan
- Centre d'Etude du Polymorphisme Humain (CEPH), Paris, France
| | - Pilar Galan
- Université Paris 13 Sorbonne Paris Cité, UREN (Nutritional Epidemiology Research Unit), Inserm (U557), Inra (U1125), Cnam, Bobigny, France
| | - Mark Lathrop
- McGill University and Génome Québec Innovation Centre, Montréal, Canada
| | - Towfique Raj
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
- Program in Translational NeuroPsychiatric Genomics, Institute for the Neurosciences, Department of Neurology, Brigham and Women's Hospital, Boston, Massachusetts, United States of America
| | - Philip L. De Jager
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
- Program in Translational NeuroPsychiatric Genomics, Institute for the Neurosciences, Department of Neurology, Brigham and Women's Hospital, Boston, Massachusetts, United States of America
| | - Soumya Raychaudhuri
- Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
- NIHR Manchester Musculoskeletal Biomedical Research Unit, Central Manchester NHS Foundation Trust, Manchester Academic Health Sciences Centre, Manchester, United Kingdom
| | - Jane Worthington
- Arthritis Research UK Epidemiology Unit, Centre for Musculoskeletal Research, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
- National Institute for Health Research, Manchester Musculoskeletal Biomedical Research Unit, Central Manchester University Hospitals National Health Service Foundation Trust, Manchester Academic Health Sciences Centre, Manchester, United Kingdom
| | - Leonid Padyukov
- Rheumatology Unit, Department of Medicine (Solna), Karolinska Institutet, Stockholm, Sweden
| | - Lars Klareskog
- Rheumatology Unit, Department of Medicine (Solna), Karolinska Institutet, Stockholm, Sweden
| | - Katherine A. Siminovitch
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Canada
- Toronto General Research Institute, Toronto, Canada
- Department of Medicine, University of Toronto, Toronto, Canada
| | - Peter K. Gregersen
- The Feinstein Institute for Medical Research, North Shore–Long Island Jewish Health System, Manhasset, New York, United States of America
| | - Elaine R. Mardis
- The Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
| | - Thurayya Arayssi
- Weill Cornell Medical College-Qatar, Education City, Doha, Qatar
| | - Layla A. Kazkaz
- Tishreen Hospital, Damascus, Syria
- Syrian Association for Rheumatology, Damascus, Syria
| | - Robert M. Plenge
- Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States of America
| |
|
791
|
Sequential transcriptome analysis of human liver cancer indicates late stage acquisition of malignant traits. J Hepatol 2014; 60:346-353. [PMID: 24512821 PMCID: PMC3943679 DOI: 10.1016/j.jhep.2013.10.014] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Received: 05/03/2013] [Revised: 09/30/2013] [Accepted: 10/09/2013] [Indexed: 12/13/2022]
Abstract
BACKGROUND & AIMS: Human hepatocarcinogenesis is a multi-step process starting from dysplastic lesions to early carcinomas (eHCC) that ultimately progress to HCC (pHCC). However, the sequential molecular alterations driving malignant transformation of the pre-neoplastic lesions are not clearly defined. This lack of information represents a major challenge in the clinical management of patients at risk. METHODS: We applied next-generation transcriptome sequencing to tumor-free surrounding liver (n = 7), low- (n = 4) and high-grade (n = 9) dysplastic lesions, eHCC (n = 5) and pHCC (n = 3) from 8 HCC patients with hepatitis B infection. Integrative analyses of genetic and transcriptomic changes were performed to characterize the genomic alterations during hepatocarcinogenesis. RESULTS: We report that changes in transcriptomes of early lesions including eHCC were modest and surprisingly homogeneous. Extensive genetic alterations and subsequent activation of prognostically adverse signaling pathways occurred only late during hepatocarcinogenesis and were centered on TGFβ, WNT, NOTCH, and EMT-related genes, highlighting the molecular diversity of pHCC. We further identify IGFALS as a key genetic determinant preferentially down-regulated in pHCC. CONCLUSIONS: Our results define new hallmarks in molecular stratification and therapy options for patients at risk for HCC, and merit larger prospective investigations to develop a modified clinical decision-making algorithm based on individualized next-generation sequencing analyses.
|
792
|
Ning L, Liu G, Li G, Hou Y, Tong Y, He J. Current challenges in the bioinformatics of single cell genomics. Front Oncol 2014; 4:7. [PMID: 24478987 PMCID: PMC3902584 DOI: 10.3389/fonc.2014.00007] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Received: 10/30/2013] [Accepted: 01/12/2014] [Indexed: 11/13/2022] Open
Abstract
Single cell genomics is a rapidly growing field with many new techniques emerging in the past few years. However, few bioinformatics tools specific for single cell genomics analysis are available. Single cell DNA/RNA sequencing data usually have low genome coverage and high amplification bias, which makes bioinformatics analysis challenging. Many current bioinformatics tools developed for bulk cell sequencing do not work well with single cell sequencing data. Here, we summarize current challenges in the bioinformatics analysis of single cell genomic DNA sequencing and single cell transcriptomes. These challenges include calling copy number variations, identifying mutated genes in tumor samples, reconstructing cell lineages, recovering low abundant transcripts, and improving the accuracy of quantitative analysis of transcripts. Development in single cell genomics bioinformatics analysis will promote the application of this technology to basic biology and medical research.
Affiliation(s)
- Luwen Ning
- Department of Biology, South University of Science and Technology of China, Shenzhen, China
| | | | | | | | - Yin Tong
- Department of Biology, South University of Science and Technology of China, Shenzhen, China
| | - Jiankui He
- Department of Biology, South University of Science and Technology of China, Shenzhen, China
| |
|
793
|
Grunert M, Dorn C, Schueler M, Dunkel I, Schlesinger J, Mebus S, Alexi-Meskishvili V, Perrot A, Wassilew K, Timmermann B, Hetzer R, Berger F, Sperling SR. Rare and private variations in neural crest, apoptosis and sarcomere genes define the polygenic background of isolated Tetralogy of Fallot. Hum Mol Genet 2014; 23:3115-28. [PMID: 24459294 DOI: 10.1093/hmg/ddu021] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Indexed: 01/17/2023] Open
Abstract
Tetralogy of Fallot (TOF) is the most common cyanotic congenital heart disease. Its genetic basis is demonstrated by an increased recurrence risk in siblings and familial cases. However, the majority of TOF are sporadic, isolated cases of undefined origin and it had been postulated that rare and private autosomal variations in concert define its genetic basis. To elucidate this hypothesis, we performed a multilevel study using targeted re-sequencing and whole-transcriptome profiling. We developed a novel concept based on a gene's mutation frequency to unravel the polygenic origin of TOF. We show that isolated TOF is caused by a combination of deleterious private and rare mutations in genes essential for apoptosis and cell growth, the assembly of the sarcomere as well as for the neural crest and secondary heart field, the cellular basis of the right ventricle and its outflow tract. Affected genes coincide in an interaction network with significant disturbances in expression shared by cases with a mutually affected TOF gene. The majority of genes show continuous expression during adulthood, which opens a new route to understand the diversity in the long-term clinical outcome of TOF cases. Our findings demonstrate that TOF has a polygenic origin and that understanding the genetic basis can lead to novel diagnostic and therapeutic routes. Moreover, the novel concept of the gene mutation frequency is a versatile measure and can be applied to other open genetic disorders.
Affiliation(s)
- Marcel Grunert
- Group of Cardiovascular Genetics, Department of Vertebrate Genomics and Cardiovascular Genetics, Experimental and Clinical Research Center, Charité-Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin 13125, Germany
| | - Cornelia Dorn
- Group of Cardiovascular Genetics, Department of Vertebrate Genomics and Cardiovascular Genetics, Experimental and Clinical Research Center, Charité-Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin 13125, Germany; Department of Biology, Chemistry and Pharmacy, Free University of Berlin, Berlin 14195, Germany
| | - Markus Schueler
- Group of Cardiovascular Genetics, Department of Vertebrate Genomics and Cardiovascular Genetics, Experimental and Clinical Research Center, Charité-Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin 13125, Germany
| | - Ilona Dunkel
- Group of Cardiovascular Genetics, Department of Vertebrate Genomics and
| | - Jenny Schlesinger
- Group of Cardiovascular Genetics, Department of Vertebrate Genomics and Cardiovascular Genetics, Experimental and Clinical Research Center, Charité-Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin 13125, Germany
| | - Siegrun Mebus
- Department of Pediatric Cardiology, German Heart Institute Berlin and Department of Pediatric Cardiology, Charité-Universitätsmedizin Berlin, Berlin 13353, Germany
| | | | - Andreas Perrot
- Cardiovascular Genetics, Experimental and Clinical Research Center, Charité-Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin 13125, Germany
| | | | - Bernd Timmermann
- Next Generation Service Group, Max Planck Institute for Molecular Genetics, Berlin 14195, Germany
| | | | - Felix Berger
- Department of Pediatric Cardiology, German Heart Institute Berlin and Department of Pediatric Cardiology, Charité-Universitätsmedizin Berlin, Berlin 13353, Germany
| | - Silke R Sperling
- Group of Cardiovascular Genetics, Department of Vertebrate Genomics and Cardiovascular Genetics, Experimental and Clinical Research Center, Charité-Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin 13125, Germany; Department of Biology, Chemistry and Pharmacy, Free University of Berlin, Berlin 14195, Germany
| |
|
794
|
D'Auria G, Schneider MV, Moya A. Live genomics for pathogen monitoring in public health. Pathogens 2014; 3:93-108. [PMID: 25437609 PMCID: PMC4235738 DOI: 10.3390/pathogens3010093] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/26/2013] [Revised: 12/16/2013] [Accepted: 01/07/2014] [Indexed: 02/07/2023] Open
Abstract
Whole genome analysis based on next generation sequencing (NGS) now represents an affordable framework in public health systems. Robust analytical pipelines for genomic data provide, within a short lapse of time (hours), information about taxonomy, comparative genomics (pan-genome) and single-nucleotide polymorphism profiles. Pathogenic organisms of interest can be tracked at the genomic level, allowing several variables to be monitored at once, including epidemiology, pathogenicity, resistance to antibiotics, virulence, persistence factors, mobile elements and adaptation features. Such information can be obtained not only on a broad scale, but also at the "local" level, such as in the event of a recurrent or emergency outbreak. This paper reviews the state of the art in infection diagnostics in the context of modern NGS methodologies. We describe how actuation protocols in a public health environment will benefit from a "streaming approach" (pipeline). Such a pipeline would include NGS data quality assessment, data mining for comparative analysis, searching for differential genetic features, such as virulence, resistance, persistence factors and mutation profiles (SNPs and InDels), and formatted "comprehensible" results. Such analytical protocols will enable a quick response to the needs of locally circumscribed outbreaks, providing information on the causes of resistance and genetic tracking elements for rapid detection, and monitoring actuations for present and future occurrences.
Affiliation(s)
- Giuseppe D'Auria
- Genómica y Salud, Fundación para el Fomento de la Investigación Sanitaria y Biomédica de la Comunidad Valenciana (FISABIO-Salud Pública), Avenida de Cataluña 21, 46020 Valencia, Spain.
| | | | - Andrés Moya
- Genómica y Salud, Fundación para el Fomento de la Investigación Sanitaria y Biomédica de la Comunidad Valenciana (FISABIO-Salud Pública), Avenida de Cataluña 21, 46020 Valencia, Spain.
| |
|
795
|
Khoddami V, Cairns BR. Transcriptome-wide target profiling of RNA cytosine methyltransferases using the mechanism-based enrichment procedure Aza-IP. Nat Protoc 2014; 9:337-61. [PMID: 24434802 DOI: 10.1038/nprot.2014.014] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Indexed: 01/02/2023]
Abstract
Cytosine methylation within RNA is common, but its full scope and functions are poorly understood, as the RNA targets of most mammalian cytosine RNA methyltransferases (m(5)C-RMTs) remain uncharacterized. To enable their characterization, we developed a mechanism-based method for transcriptome-wide m(5)C-RMT target profiling. All characterized mammalian m(5)C-RMTs form a reversible covalent intermediate with their cytosine substrate, a covalent linkage that is trapped when the reaction is conducted on the cytosine analog 5-azacytidine (5-aza-C). We used this property to develop Aza-immunoprecipitation (Aza-IP), a methodology to form stable m(5)C-RMT-RNA linkages in cell culture, followed by IP and high-throughput sequencing, to identify direct RNA substrates of m(5)C-RMTs. Remarkably, a cytosine-to-guanine (C→G) transversion occurs specifically at target cytosines, allowing the simultaneous identification of the precise target cytosine within each RNA. Thus, Aza-IP reports only direct RNA substrates, and the C→G transversion provides an important criterion for target cytosine identification, which is not available in alternative approaches. Here we present a step-by-step protocol for Aza-IP and downstream analysis, designed to identify substrate RNAs and precise cytosine targets of m(5)C-RMTs. The entire protocol takes 40-50 d to complete.
Affiliation(s)
- Vahid Khoddami
- Howard Hughes Medical Institute (HHMI), Department of Oncological Sciences, Huntsman Cancer Institute, University of Utah School of Medicine, Salt Lake City, Utah, USA
| | - Bradley R Cairns
- Howard Hughes Medical Institute (HHMI), Department of Oncological Sciences, Huntsman Cancer Institute, University of Utah School of Medicine, Salt Lake City, Utah, USA
| |
|
796
|
McElroy K, Thomas T, Luciani F. Deep sequencing of evolving pathogen populations: applications, errors, and bioinformatic solutions. Microbial Informatics and Experimentation 2014; 4:1. [PMID: 24428920 PMCID: PMC3902414 DOI: 10.1186/2042-5783-4-1] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Received: 10/14/2013] [Accepted: 01/07/2014] [Indexed: 12/15/2022]
Abstract
Deep sequencing harnesses the high throughput nature of next generation sequencing technologies to generate population samples, treating information contained in individual reads as meaningful. Here, we review applications of deep sequencing to pathogen evolution. Pioneering deep sequencing studies from the virology literature are discussed, such as whole genome Roche-454 sequencing analyses of the dynamics of the rapidly mutating pathogens hepatitis C virus and HIV. Extension of the deep sequencing approach to bacterial populations is then discussed, including the impacts of emerging sequencing technologies. While it is clear that deep sequencing has unprecedented potential for assessing the genetic structure and evolutionary history of pathogen populations, bioinformatic challenges remain. We summarise current approaches to overcoming these challenges, in particular methods for detecting low frequency variants in the context of sequencing error and reconstructing individual haplotypes from short reads.
Affiliation(s)
- Kerensa McElroy
- Centre for Marine Bio-Innovation and School of Biotechnology and Biomolecular Sciences, UNSW, Sydney, NSW 2052, Australia.
| | | | | |
|
797
|
Duitama J, Quintero JC, Cruz DF, Quintero C, Hubmann G, Foulquié-Moreno MR, Verstrepen KJ, Thevelein JM, Tohme J. An integrated framework for discovery and genotyping of genomic variants from high-throughput sequencing experiments. Nucleic Acids Res 2014; 42:e44. [PMID: 24413664 PMCID: PMC3973327 DOI: 10.1093/nar/gkt1381] [Citation(s) in RCA: 77] [Impact Index Per Article: 7.0] [Indexed: 12/30/2022] Open
Abstract
Recent advances in high-throughput sequencing (HTS) technologies and computing capacity have produced unprecedented amounts of genomic data that have unraveled the genetics of phenotypic variability in several species. However, operating and integrating current software tools for data analysis still require important investments in highly skilled personnel. Developing accurate, efficient and user-friendly software packages for HTS data analysis will lead to a more rapid discovery of genomic elements relevant to medical, agricultural and industrial applications. We therefore developed Next-Generation Sequencing Eclipse Plug-in (NGSEP), a new software tool for integrated, efficient and user-friendly detection of single nucleotide variants (SNVs), indels and copy number variants (CNVs). NGSEP includes modules for read alignment, sorting, merging, functional annotation of variants, filtering and quality statistics. Analysis of sequencing experiments in yeast, rice and human samples shows that NGSEP has superior accuracy and efficiency, compared with currently available packages for variants detection. We also show that only a comprehensive and accurate identification of repeat regions and CNVs allows researchers to properly separate SNVs from differences between copies of repeat elements. We expect that NGSEP will become a strong support tool to empower the analysis of sequencing data in a wide range of research projects on different species.
Affiliation(s)
- Jorge Duitama
- Agrobiodiversity research area, International Center for Tropical Agriculture (CIAT), Km 17 Recta Cali- Palmira, A.A. 6713 Cali, Colombia, Laboratory of Molecular Cell Biology, Department of Biology, Institute of Botany and Microbiology, KU Leuven, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, Department of Molecular Microbiology, VIB, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, VIB Laboratory of Systems Biology, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium and Laboratory for Genetics and Genomics, Centre of Microbial and Plant Genetics, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium
- *To whom correspondence should be addressed. Tel: +57 2 4450000; Fax: +57 2 4450073;
| | - Juan Camilo Quintero
- Agrobiodiversity research area, International Center for Tropical Agriculture (CIAT), Km 17 Recta Cali- Palmira, A.A. 6713 Cali, Colombia, Laboratory of Molecular Cell Biology, Department of Biology, Institute of Botany and Microbiology, KU Leuven, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, Department of Molecular Microbiology, VIB, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, VIB Laboratory of Systems Biology, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium and Laboratory for Genetics and Genomics, Centre of Microbial and Plant Genetics, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium
| | - Daniel Felipe Cruz
- Agrobiodiversity research area, International Center for Tropical Agriculture (CIAT), Km 17 Recta Cali- Palmira, A.A. 6713 Cali, Colombia, Laboratory of Molecular Cell Biology, Department of Biology, Institute of Botany and Microbiology, KU Leuven, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, Department of Molecular Microbiology, VIB, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, VIB Laboratory of Systems Biology, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium and Laboratory for Genetics and Genomics, Centre of Microbial and Plant Genetics, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium
| | - Constanza Quintero
- Agrobiodiversity research area, International Center for Tropical Agriculture (CIAT), Km 17 Recta Cali- Palmira, A.A. 6713 Cali, Colombia, Laboratory of Molecular Cell Biology, Department of Biology, Institute of Botany and Microbiology, KU Leuven, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, Department of Molecular Microbiology, VIB, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, VIB Laboratory of Systems Biology, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium and Laboratory for Genetics and Genomics, Centre of Microbial and Plant Genetics, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium
| | - Georg Hubmann
- Agrobiodiversity research area, International Center for Tropical Agriculture (CIAT), Km 17 Recta Cali- Palmira, A.A. 6713 Cali, Colombia, Laboratory of Molecular Cell Biology, Department of Biology, Institute of Botany and Microbiology, KU Leuven, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, Department of Molecular Microbiology, VIB, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, VIB Laboratory of Systems Biology, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium and Laboratory for Genetics and Genomics, Centre of Microbial and Plant Genetics, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium
| | - Maria R. Foulquié-Moreno
- Agrobiodiversity research area, International Center for Tropical Agriculture (CIAT), Km 17 Recta Cali- Palmira, A.A. 6713 Cali, Colombia, Laboratory of Molecular Cell Biology, Department of Biology, Institute of Botany and Microbiology, KU Leuven, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, Department of Molecular Microbiology, VIB, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, VIB Laboratory of Systems Biology, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium and Laboratory for Genetics and Genomics, Centre of Microbial and Plant Genetics, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium
| | - Kevin J. Verstrepen
- Agrobiodiversity research area, International Center for Tropical Agriculture (CIAT), Km 17 Recta Cali- Palmira, A.A. 6713 Cali, Colombia, Laboratory of Molecular Cell Biology, Department of Biology, Institute of Botany and Microbiology, KU Leuven, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, Department of Molecular Microbiology, VIB, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, VIB Laboratory of Systems Biology, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium and Laboratory for Genetics and Genomics, Centre of Microbial and Plant Genetics, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium
| | - Johan M. Thevelein
- Agrobiodiversity research area, International Center for Tropical Agriculture (CIAT), Km 17 Recta Cali- Palmira, A.A. 6713 Cali, Colombia, Laboratory of Molecular Cell Biology, Department of Biology, Institute of Botany and Microbiology, KU Leuven, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, Department of Molecular Microbiology, VIB, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, VIB Laboratory of Systems Biology, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium and Laboratory for Genetics and Genomics, Centre of Microbial and Plant Genetics, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium
| | - Joe Tohme
- Agrobiodiversity research area, International Center for Tropical Agriculture (CIAT), Km 17 Recta Cali- Palmira, A.A. 6713 Cali, Colombia, Laboratory of Molecular Cell Biology, Department of Biology, Institute of Botany and Microbiology, KU Leuven, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, Department of Molecular Microbiology, VIB, Kasteelpark Arenberg 31, B-3001 Leuven-Heverlee, Flanders, Belgium, VIB Laboratory of Systems Biology, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium and Laboratory for Genetics and Genomics, Centre of Microbial and Plant Genetics, KU Leuven, Gaston Geenslaan 1, B-3001 Leuven-Heverlee, Flanders, Belgium
| |
|
798
|
Mayba O, Gnad F, Peyton M, Zhang F, Walter K, Du P, Huntley MA, Jiang Z, Liu J, Haverty PM, Gentleman RC, Li R, Minna JD, Li Y, Shames DS, Zhang Z. Integrative analysis of two cell lines derived from a non-small-lung cancer patient--a panomics approach. Pacific Symposium on Biocomputing 2014:75-86. [PMID: 24297535 PMCID: PMC3940063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Cancer cells derived from different stages of tumor progression may exhibit distinct biological properties, as exemplified by the paired lung cancer cell lines H1993 and H2073. While H1993 was derived from a chemo-naive metastasized tumor, H2073 originated from the chemo-resistant primary tumor from the same patient and exhibits a strikingly different drug response profile. To understand the underlying genetic and epigenetic bases for their biological properties, we investigated these cells using a wide range of large-scale methods including whole genome sequencing, RNA sequencing, SNP array, DNA methylation array, and de novo genome assembly. We conducted an integrative analysis of both cell lines to distinguish between potential driver and passenger alterations. Although many genes are mutated in these cell lines, the combination of DNA- and RNA-based variant information strongly implicates a small number of genes including TP53 and STK11 as likely drivers. Likewise, we found a diverse set of genes differentially expressed between these cell lines, but only a fraction can be attributed to changes in DNA copy number or methylation. This set included the ABC transporter ABCC4, implicated in drug resistance, and the metastasis-associated MET oncogene. While the rich data content allowed us to reduce the space of hypotheses that could explain most of the observed biological properties, we also caution that there is a lack of statistical power and inherent limitations in such single patient case studies.
Affiliation(s)
- OLEG MAYBA
- Department of Bioinformatics and Computational Biology, Genentech, Inc., South San Francisco, CA 94080, USA
- FLORIAN GNAD
- Department of Bioinformatics and Computational Biology, Genentech, Inc., South San Francisco, CA 94080, USA
- MICHAEL PEYTON
- Hamon Center for Therapeutic Oncology Research, UT-Southwestern Medical Center, Dallas, TX 75390, USA
- FAN ZHANG
- BGI-Shenzhen, Shenzhen 518083, China
- KIMBERLY WALTER
- Department of Development Oncology Diagnostics, Genentech, Inc., South San Francisco, CA 94080, USA
- PAN DU
- Department of Bioinformatics and Computational Biology, Genentech, Inc., South San Francisco, CA 94080, USA
- MELANIE A. HUNTLEY
- Department of Bioinformatics and Computational Biology, Genentech, Inc., South San Francisco, CA 94080, USA
- ZHAOSHI JIANG
- Department of Bioinformatics and Computational Biology, Genentech, Inc., South San Francisco, CA 94080, USA
- JINFENG LIU
- Department of Bioinformatics and Computational Biology, Genentech, Inc., South San Francisco, CA 94080, USA
- PETER M. HAVERTY
- Department of Bioinformatics and Computational Biology, Genentech, Inc., South San Francisco, CA 94080, USA
- ROBERT C. GENTLEMAN
- Department of Bioinformatics and Computational Biology, Genentech, Inc., South San Francisco, CA 94080, USA
- JOHN D. MINNA
- Hamon Center for Therapeutic Oncology Research, UT-Southwestern Medical Center, Dallas, TX 75390, USA
- DAVID S. SHAMES
- Department of Development Oncology Diagnostics, Genentech, Inc., South San Francisco, CA 94080, USA
- ZEMIN ZHANG
- Department of Bioinformatics and Computational Biology, Genentech, Inc., South San Francisco, CA 94080, USA
|
799
|
Guo Y, Li CI, Sheng Q, Winther JF, Cai Q, Boice JD, Shyr Y. Very low-level heteroplasmy mtDNA variations are inherited in humans. J Genet Genomics 2013; 40:607-15. [PMID: 24377867 PMCID: PMC4149221 DOI: 10.1016/j.jgg.2013.10.003] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2013] [Revised: 10/24/2013] [Accepted: 10/27/2013] [Indexed: 01/18/2023]
Abstract
Little is known about the inheritance of very low-heteroplasmy mitochondrial DNA (mtDNA) variations. Even with the development of new next-generation sequencing methods, the practical lower limit of measured heteroplasmy is still about 1% due to the inherent noise level of the sequencing. In this study, we sequenced the mitochondrial genome of 44 individuals using Illumina high-throughput sequencing technology and obtained high-coverage mitochondrial sequencing data. Our study population contains many mother-offspring pairs. This unique study design allows us to bypass the usual heteroplasmy limitation by analyzing the correlation of mutation levels at each position in the mtDNA sequence between maternally related pairs and non-related pairs. The study showed that very low heteroplasmy variants, down to almost 0.1%, are inherited maternally and that this inheritance begins to decrease at about 0.5%, corresponding to a bottleneck of about 200 mtDNA.
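One plausible reading of the link between the 0.5% level and the bottleneck size, offered as an assumption rather than as the authors' stated derivation: if roughly N mtDNA copies pass through the bottleneck, a variant carried by a single copy sits at a heteroplasmy level of about 1/N, so variants below that level are increasingly unlikely to be transmitted. With N (a symbol introduced here for illustration) set to the abstract's figure of about 200:

$$
h_{\min} \approx \frac{1}{N} = \frac{1}{200} = 0.005 = 0.5\%
$$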
Affiliation(s)
- Yan Guo
- Vanderbilt Ingram Cancer Center, Center for Quantitative Sciences, Nashville, TN 37232, USA
- Chung-I Li
- Department of Applied Mathematics, Chiayi University (NCYU), Chiayi 60004, Taiwan, China
- Quanhu Sheng
- Vanderbilt Ingram Cancer Center, Center for Quantitative Sciences, Nashville, TN 37232, USA
- Jeanette F Winther
- Institute of Cancer Epidemiology, Danish Cancer Society, Copenhagen DK-2100, Denmark
- Qiuyin Cai
- Vanderbilt Epidemiology Center, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
- John D Boice
- National Council on Radiation Protection & Measurements, Bethesda, MD 20814, USA
- Yu Shyr
- Vanderbilt Ingram Cancer Center, Center for Quantitative Sciences, Nashville, TN 37232, USA
|
800
|
Abstract
Summary: The increasing availability of high-throughput sequencing technologies has led to thousands of human genomes having been sequenced in the past years. Efforts such as the 1000 Genomes Project further add to the availability of human genome variation data. However, to date, there is no method that can map reads of a newly sequenced human genome to a large collection of genomes. Instead, methods rely on aligning reads to a single reference genome. This leads to inherent biases and lower accuracy. To tackle this problem, a new alignment tool BWBBLE is introduced in this article. We (i) introduce a new compressed representation of a collection of genomes, which explicitly tackles the genomic variation observed at every position, and (ii) design a new alignment algorithm based on the Burrows–Wheeler transform that maps short reads from a newly sequenced genome to an arbitrary collection of two or more (up to millions of) genomes with high accuracy and no inherent bias to one specific genome. Availability: http://viq854.github.com/bwbble. Contact: serafim@cs.stanford.edu
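For readers unfamiliar with the Burrows–Wheeler transform mentioned in the abstract, the following is a minimal sketch of exact-match counting by backward search over a single toy sequence. It is a generic textbook illustration, not BWBBLE's algorithm: it does not implement the multi-genome representation the abstract describes, and the function names `bwt_index` and `count_matches` are hypothetical names chosen for this example.

```python
from collections import Counter


def bwt_index(text):
    """Build a toy Burrows-Wheeler index for `text` (naive rotation sort)."""
    s = text + "$"  # "$" is a sentinel that sorts before A, C, G, T
    # Sorting rotations is equivalent to sorting suffixes because "$" is unique.
    order = sorted(range(len(s)), key=lambda i: s[i:] + s[:i])
    # The BWT is the last column: the character preceding each sorted rotation.
    bwt = "".join(s[i - 1] for i in order)
    # first[c]: number of characters in s that are strictly smaller than c.
    counts = Counter(s)
    first, total = {}, 0
    for c in sorted(counts):
        first[c] = total
        total += counts[c]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in counts:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return first, occ, len(bwt)


def count_matches(index, pattern):
    """Count exact occurrences of `pattern` using backward search."""
    first, occ, n = index
    lo, hi = 0, n  # half-open range of sorted rotations matching the suffix so far
    for c in reversed(pattern):
        if c not in first:
            return 0
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = bwt_index("ACGTACGA")
print(count_matches(index, "ACG"))
```

The naive rotation sort keeps the sketch short; real aligners build the index with suffix-array algorithms and sampled occurrence tables so that it scales to genome-sized inputs.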
Affiliation(s)
- Lin Huang
- Department of Computer Science, Stanford University, Stanford, CA 94305, USA
|