801
Valentim CLL, Cioli D, Chevalier FD, Cao X, Taylor AB, Holloway SP, Pica-Mattoccia L, Guidi A, Basso A, Tsai IJ, Berriman M, Carvalho-Queiroz C, Almeida M, Aguilar H, Frantz DE, Hart PJ, LoVerde PT, Anderson TJC. Genetic and molecular basis of drug resistance and species-specific drug action in schistosome parasites. Science 2013; 342:1385-9. [PMID: 24263136 PMCID: PMC4136436 DOI: 10.1126/science.1243106] [Citation(s) in RCA: 122] [Impact Index Per Article: 10.2] [Indexed: 12/21/2022]
Abstract
Oxamniquine resistance evolved in the human blood fluke (Schistosoma mansoni) in Brazil in the 1970s. We crossed parental parasites differing ~500-fold in drug response, determined drug sensitivity and marker segregation in clonally derived second-generation progeny, and identified a single quantitative trait locus (logarithm of odds = 31) on chromosome 6. A sulfotransferase was identified as the causative gene by using RNA interference knockdown and biochemical complementation assays, and we subsequently demonstrated independent origins of loss-of-function mutations in field-derived and laboratory-selected resistant parasites. These results demonstrate the utility of linkage mapping in a human helminth parasite, while crystallographic analyses of protein-drug interactions illuminate the mode of drug action and provide a framework for rational design of oxamniquine derivatives that kill both S. mansoni and S. haematobium, the two species responsible for >99% of schistosomiasis cases worldwide.
Affiliation(s)
- Claudia L L Valentim
- Departments of Biochemistry and Pathology, University of Texas Health Science Center, San Antonio, TX 78229, USA
802
Signor S, Seher T, Kopp A. Genomic resources for multiple species in the Drosophila ananassae species group. Fly (Austin) 2013; 7:47-57. [PMID: 23639891 DOI: 10.4161/fly.22353] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/14/2022] Open
Abstract
The development of genomic resources in non-model taxa is essential for understanding the genetic basis of biological diversity. Although the genomes of many Drosophila species have been sequenced, most of the phenotypic diversity in this genus remains to be explored. To facilitate the genetic analysis of interspecific and intraspecific variation, we have generated new genomic resources for seven species and subspecies in the D. ananassae species subgroup. We have generated large amounts of transcriptome sequence data for D. ercepeae, D. merina, D. bipectinata, D. malerkotliana malerkotliana, D. m. pallens, D. pseudoananassae pseudoananassae, and D. p. nigrens. de novo assembly resulted in contigs covering more than half of the predicted transcriptome and matching an average of 59% of annotated genes in the complete genome of D. ananassae. Most contigs, corresponding to an average of 49% of D. ananassae genes, contain sequence polymorphisms that can be used as genetic markers. Subsets of these markers were validated by genotyping the progeny of inter- and intraspecific crosses. The ananassae subgroup is an excellent model system for examining the molecular basis of speciation and phenotypic evolution. The new genomic resources will facilitate the genetic analysis of inter- and intraspecific differences in this lineage. Transcriptome sequencing provides a simple and cost-effective way to identify molecular markers at nearly single-gene density, and is equally applicable to any non-model taxa.
Affiliation(s)
- Sarah Signor
- Department of Evolution and Ecology, University of California, Davis, Davis, CA, USA.
803
Berglund EC, Lindqvist CM, Hayat S, Övernäs E, Henriksson N, Nordlund J, Wahlberg P, Forestier E, Lönnerholm G, Syvänen AC. Accurate detection of subclonal single nucleotide variants in whole genome amplified and pooled cancer samples using HaloPlex target enrichment. BMC Genomics 2013; 14:856. [PMID: 24314227 PMCID: PMC4046713 DOI: 10.1186/1471-2164-14-856] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 05/08/2013] [Accepted: 11/25/2013] [Indexed: 01/21/2023] Open
Abstract
BACKGROUND Target enrichment and resequencing is a widely used approach for identification of cancer genes and genetic variants associated with diseases. Although cost effective compared to whole genome sequencing, analysis of many samples constitutes a significant cost, which could be reduced by pooling samples before capture. Another limitation to the number of cancer samples that can be analyzed is often the amount of available tumor DNA. We evaluated the performance of whole genome amplified DNA and the power to detect subclonal somatic single nucleotide variants in non-indexed pools of cancer samples using the HaloPlex technology for target enrichment and next generation sequencing. RESULTS We captured a set of 1528 putative somatic single nucleotide variants and germline SNPs, which were identified by whole genome sequencing, with the HaloPlex technology and sequenced to a depth of 792-1752. We found that the allele fractions of the analyzed variants are well preserved during whole genome amplification and that capture specificity or variant calling is not affected. We detected a large majority of the known single nucleotide variants present uniquely in one sample with allele fractions as low as 0.1 in non-indexed pools of up to ten samples. We also identified and experimentally validated six novel variants in the samples included in the pools. CONCLUSION Our work demonstrates that whole genome amplified DNA can be used for target enrichment as effectively as genomic DNA and that accurate variant detection is possible in non-indexed pools of cancer samples. These findings show that analysis of a large number of samples is feasible at low cost, even when only small amounts of DNA are available, which significantly increases the chances of identifying recurrent mutations in cancer samples.
Affiliation(s)
- Eva C Berglund
- Department of Medical Sciences, Molecular Medicine and Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
804
Pyle A, Griffin H, Duff J, Zwolinski S, Smertenko T, Yu-Wai-Man P, Santibanez-Koref M, Horvath R, Chinnery PF. Late-onset sacsinopathy diagnosed by exome sequencing and comparative genomic hybridization. J Neurogenet 2013; 27:176-82. [PMID: 24180463 PMCID: PMC4038496 DOI: 10.3109/01677063.2013.831094] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022]
Abstract
The molecular diagnosis of adult-onset autosomal recessive cerebellar ataxias remains challenging because of genetic heterogeneity. However, recently developed molecular genetic techniques will potentially revolutionize the diagnostic approach. Here we set out to define the genetic basis of the ataxia in two brothers with no molecular diagnosis. Clinical evaluation was followed by whole-exome second-generation sequencing and comparative genomic hybridization to determine the diagnosis. Whole-exome sequencing identified a hemizygous novel spastic ataxia of Charlevoix-Saguenay (SACS) stop-codon mutation in both brothers (c.13048G→T, p.E4350*) that was present in the mother, but not the father. Comparative genomic hybridization revealed a 0.7-Mb deletion on chromosome 13q12.12 in both brothers, which included SACS and was heterozygous in the asymptomatic father. The milder phenotype, caused by a whole-gene deletion and a stop-codon mutation in SACS, indicates a loss-of-function mechanism in late-onset autosomal recessive spastic ataxia of Charlevoix-Saguenay (ARSACS), and illustrates the importance of chromosomal rearrangements in the investigation of adult-onset ataxia.
Affiliation(s)
- Angela Pyle
- Wellcome Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle upon Tyne, UK
- Helen Griffin
- Wellcome Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle upon Tyne, UK
- Jennifer Duff
- Wellcome Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle upon Tyne, UK
- Simon Zwolinski
- Wellcome Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle upon Tyne, UK
- Tania Smertenko
- Wellcome Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle upon Tyne, UK
- Patrick Yu-Wai-Man
- Wellcome Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle upon Tyne, UK
- Mauro Santibanez-Koref
- Wellcome Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle upon Tyne, UK
- Rita Horvath
- Wellcome Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle upon Tyne, UK
- Patrick F. Chinnery
- Wellcome Centre for Mitochondrial Research, Institute of Genetic Medicine, Newcastle upon Tyne, UK
805
Koboldt DC, Steinberg KM, Larson DE, Wilson RK, Mardis ER. The next-generation sequencing revolution and its impact on genomics. Cell 2013; 155:27-38. [PMID: 24074859 DOI: 10.1016/j.cell.2013.09.006] [Citation(s) in RCA: 645] [Impact Index Per Article: 53.8] [Received: 08/03/2013] [Indexed: 02/07/2023]
Abstract
Genomics is a relatively new scientific discipline, having DNA sequencing as its core technology. As technology has improved the cost and scale of genome characterization over sequencing's 40-year history, the scope of inquiry has commensurately broadened. Massively parallel sequencing has proven revolutionary, shifting the paradigm of genomics to address biological questions at a genome-wide scale. Sequencing now empowers clinical diagnostics and other aspects of medical care, including disease risk, therapeutic identification, and prenatal testing. This Review explores the current state of genomics in the massively parallel sequencing era.
Affiliation(s)
- Daniel C Koboldt
- The Genome Institute, School of Medicine, Washington University, St. Louis, MO 63108, USA
806
Toonen RJ, Puritz JB, Forsman ZH, Whitney JL, Fernandez-Silva I, Andrews KR, Bird CE. ezRAD: a simplified method for genomic genotyping in non-model organisms. PeerJ 2013; 1:e203. [PMID: 24282669 PMCID: PMC3840413 DOI: 10.7717/peerj.203] [Citation(s) in RCA: 127] [Impact Index Per Article: 10.6] [Received: 07/16/2013] [Accepted: 10/13/2013] [Indexed: 12/17/2022] Open
Abstract
Here, we introduce ezRAD, a novel strategy for restriction site–associated DNA (RAD) that requires little technical expertise or investment in laboratory equipment, and demonstrate its utility for ten non-model organisms across a wide taxonomic range. ezRAD differs from other RAD methods primarily through its use of standard Illumina TruSeq library preparation kits, which makes it possible for any laboratory to send out to a commercial genomic core facility for library preparation and next-generation sequencing with virtually no additional investment beyond the cost of the service itself. This simplification opens RADseq to any lab with the ability to extract DNA and perform a restriction digest. ezRAD also differs from others in its flexibility to use any restriction enzyme (or combination of enzymes) that cuts frequently enough to generate fragments of the desired size range, without requiring the purchase of separate adapters for each enzyme or a sonication step, which can further decrease the cost involved in choosing optimal enzymes for particular species and research questions. We apply this method across a wide taxonomic diversity of non-model organisms to demonstrate the utility and flexibility of our approach. The simplicity of ezRAD makes it particularly useful for the discovery of single nucleotide polymorphisms and targeted amplicon sequencing in natural populations of non-model organisms that have been historically understudied because of lack of genomic information.
Affiliation(s)
- Robert J Toonen
- Hawai'i Institute of Marine Biology, School of Ocean & Earth Sciences & Technology, University of Hawai'i at Mānoa, Coconut Island, Kāne'ohe, HI, United States
807
Fanciulli M, Di Bonaventura C, Egeo G, Fattouch J, Dazzo E, Radovic S, Spadotto A, Giallonardo AT, Nobile C. Suggestive linkage of familial mesial temporal lobe epilepsy to chromosome 3q26. Epilepsy Res 2013; 108:232-40. [PMID: 24315020 DOI: 10.1016/j.eplepsyres.2013.11.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 04/08/2013] [Revised: 09/26/2013] [Accepted: 11/03/2013] [Indexed: 12/01/2022]
Abstract
PURPOSE To describe the clinical findings in a family with a benign form of mesial temporal lobe epilepsy and to identify the causative genetic factors. METHODS All participants were personally interviewed and underwent neurologic examination. The affected subjects underwent EEG, and most of them also underwent neuroradiological examination (MRI). All family members were genotyped with the HumanCytoSNP-12 v1.0 beadchip and linkage analysis was performed with the Merlin and Simwalk2 programs. Exome sequencing was performed on a HiSeq2000, after exome capture with the SureSelect 50 Mb kit v2.0. RESULTS The family had 6 members with temporal lobe epilepsy. Age at seizure onset ranged from 8 to 13 years. Five patients had epigastric auras often associated with oro-alimentary automatic activity, 3 patients presented loss of contact, and 2 experienced secondary generalizations. Febrile seizures occurred in 2 family members, 1 of whom also had temporal lobe epilepsy. EEG showed focal slow waves and epileptic abnormalities over temporal regions in 1 patient and was normal in the other affected individuals. MRI was normal in all temporal lobe epilepsy patients. We performed single nucleotide polymorphism-array linkage analysis of the family and found suggestive evidence of linkage (LOD score=2.106) to a region on chromosome 3q26. Haplotype reconstruction supported the linkage data and showed that the majority of unaffected family members carried the at-risk haplotype. Whole exome sequencing failed to identify pathogenic mutations in genes of the candidate region. CONCLUSIONS Our data suggest the existence of a novel locus for benign familial mesial temporal lobe epilepsy on chromosome 3q26. Our failure to identify pathogenic mutations in genes of this region may be due to limitations of the exome sequencing technology.
Affiliation(s)
- Gabriella Egeo
- Department of Neurological Sciences, University of Rome "Sapienza", Roma, Italy; IRCCS San Raffaele Pisana, Roma, Italy
- Jinane Fattouch
- Department of Neurological Sciences, University of Rome "Sapienza", Roma, Italy
- Emanuela Dazzo
- CNR - Institute of Neurosciences, Section of Padua, Padova, Italy
- Carlo Nobile
- CNR - Institute of Neurosciences, Section of Padua, Padova, Italy.
808
Konczal M, Koteja P, Stuglik MT, Radwan J, Babik W. Accuracy of allele frequency estimation using pooled RNA-Seq. Mol Ecol Resour 2013; 14:381-92. [PMID: 24119300 DOI: 10.1111/1755-0998.12186] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Received: 06/21/2013] [Revised: 09/30/2013] [Accepted: 10/06/2013] [Indexed: 11/28/2022]
Abstract
For nonmodel organisms, genome-wide information that describes functionally relevant variation may be obtained by RNA-Seq following de novo transcriptome assembly. While sequencing has become relatively inexpensive, the preparation of a large number of sequencing libraries remains prohibitively expensive for population genetic analyses of nonmodel species. Pooling samples may then be an attractive alternative. To test whether pooled RNA-Seq accurately predicts true allele frequencies, we analysed the liver transcriptomes of 10 bank voles. Each sample was sequenced both as an individually barcoded library and as a part of a pool. Equal amounts of total RNA from each vole were pooled prior to mRNA selection and library construction. Reads were mapped onto the de novo assembled reference transcriptome. High-quality genotypes for individual voles, determined for 23,682 SNPs, provided information on 'true' allele frequencies; allele frequencies estimated from the pool were then compared with these values. 'True' frequencies and those estimated from the pool were highly correlated. Mean relative estimation error was 21% and did not depend on expression level. However, we also observed a minor effect of interindividual variation in gene expression and allele-specific gene expression influencing allele frequency estimation accuracy. Moreover, we observed a strong negative relationship between minor allele frequency and relative estimation error. Our results indicate that pooled RNA-Seq exhibits accuracy comparable with pooled genome resequencing, but variation in expression level between individuals should be assessed and accounted for. This should help in taking into account the difference in accuracy between conservatively expressed transcripts and those that are variable in expression level.
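The comparison described in this abstract (pool-estimated versus individually genotyped allele frequencies) can be illustrated with a minimal sketch. The abstract does not define "relative estimation error", so the formula used here, |pooled - true| / true, is an assumption, and the function names and the three toy frequency pairs are hypothetical, not data from the study.

```python
def relative_errors(true_freqs, pooled_freqs):
    """Per-SNP relative error, assumed here to be |pooled - true| / true."""
    if len(true_freqs) != len(pooled_freqs):
        raise ValueError("frequency lists must have the same length")
    errors = []
    for true_f, pooled_f in zip(true_freqs, pooled_freqs):
        if true_f <= 0:
            raise ValueError("true allele frequency must be positive")
        errors.append(abs(pooled_f - true_f) / true_f)
    return errors


def mean_relative_error(true_freqs, pooled_freqs):
    """Average of the per-SNP relative errors."""
    errors = relative_errors(true_freqs, pooled_freqs)
    return sum(errors) / len(errors)


# Hypothetical toy values: similar absolute deviations at each SNP,
# but progressively lower "true" frequencies.
true_freqs = [0.50, 0.25, 0.10]
pooled_freqs = [0.55, 0.20, 0.13]

errors = relative_errors(true_freqs, pooled_freqs)
mean_error = mean_relative_error(true_freqs, pooled_freqs)
print([round(e, 2) for e in errors], round(mean_error, 2))
```

Under this assumed definition, a similar absolute deviation produces a larger relative error at a rarer allele, which is one way to read the negative relationship between minor allele frequency and relative error reported in the abstract.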
Affiliation(s)
- M Konczal
- Institute of Environmental Sciences, Jagiellonian University, Gronostajowa 7, 30-387, Kraków, Poland
809
Lopez-Doriga A, Feliubadaló L, Menéndez M, Lopez-Doriga S, Morón-Duran FD, del Valle J, Tornero E, Montes E, Cuesta R, Campos O, Gómez C, Pineda M, González S, Moreno V, Capellá G, Lázaro C. ICO amplicon NGS data analysis: a Web tool for variant detection in common high-risk hereditary cancer genes analyzed by amplicon GS Junior next-generation sequencing. Hum Mutat 2013; 35:271-7. [PMID: 24227591 DOI: 10.1002/humu.22484] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/26/2013] [Accepted: 11/07/2013] [Indexed: 12/19/2022]
Abstract
Next-generation sequencing (NGS) has revolutionized genomic research and is set to have a major impact on genetic diagnostics thanks to the advent of benchtop sequencers and flexible kits for targeted libraries. Among the main hurdles in NGS are the difficulty of performing bioinformatic analysis of the huge volume of data generated and the high number of false positive calls that could be obtained, depending on the NGS technology and the analysis pipeline. Here, we present the development of a free and user-friendly Web data analysis tool that detects and filters sequence variants, provides coverage information, and allows the user to customize some basic parameters. The tool has been developed to provide accurate genetic analysis of targeted sequencing of common high-risk hereditary cancer genes using amplicon libraries run in a GS Junior System. The Web resource is linked to our own mutation database, to assist in the clinical classification of identified variants. We believe that this tool will greatly facilitate the use of the NGS approach in routine laboratories.
810
Lee YP, Giorgi FM, Lohse M, Kvederaviciute K, Klages S, Usadel B, Meskiene I, Reinhardt R, Hincha DK. Transcriptome sequencing and microarray design for functional genomics in the extremophile Arabidopsis relative Thellungiella salsuginea (Eutrema salsugineum). BMC Genomics 2013; 14:793. [PMID: 24228715 PMCID: PMC3832907 DOI: 10.1186/1471-2164-14-793] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Received: 12/12/2012] [Accepted: 11/11/2013] [Indexed: 11/29/2022] Open
Abstract
Background Most molecular studies of plant stress tolerance have been performed with Arabidopsis thaliana, although it is not particularly stress tolerant and may lack protective mechanisms required to survive extreme environmental conditions. Thellungiella salsuginea has attracted interest as an alternative plant model species with high tolerance of various abiotic stresses. While the T. salsuginea genome has recently been sequenced, its annotation is still incomplete and transcriptomic information is scarce. In addition, functional genomics investigations in this species are severely hampered by a lack of affordable tools for genome-wide gene expression studies. Results Here, we report the results of Thellungiella de novo transcriptome assembly and annotation based on 454 pyrosequencing and development and validation of a T. salsuginea microarray. ESTs were generated from a non-normalized and a normalized library synthesized from RNA pooled from samples covering different tissues and abiotic stress conditions. Both libraries yielded partially unique sequences, indicating their necessity to obtain comprehensive transcriptome coverage. More than 1 million sequence reads were assembled into 42,810 unigenes, approximately 50% of which could be functionally annotated. These unigenes were compared to all available Thellungiella genome sequence information. In addition, the groups of Late Embryogenesis Abundant (LEA) proteins, Mitogen Activated Protein (MAP) kinases and protein phosphatases were annotated in detail. We also predicted the target genes for 384 putative miRNAs. From the sequence information, we constructed a 44 k Agilent oligonucleotide microarray. Comparison of same-species and cross-species hybridization results showed superior performance of the newly designed array for T. salsuginea samples. The developed microarrays were used to investigate transcriptional responses of T. salsuginea and Arabidopsis during cold acclimation using the MapMan software. 
Conclusions This study provides the first comprehensive transcriptome information for the extremophile Arabidopsis relative T. salsuginea. The data constitute a more than three-fold increase in the number of publicly available unigene sequences and will greatly facilitate genome annotation. In addition, we have designed and validated the first genome-wide microarray for T. salsuginea, which will be commercially available. Together with the publicly available MapMan software this will become an important tool for functional genomics of plant stress tolerance.
Affiliation(s)
- Dirk K Hincha
- Max-Planck-Institut für Molekulare Pflanzenphysiologie, Am Mühlenberg 1, D-14476 Potsdam, Germany.
811
Kukimoto I, Maehama T, Sekizuka T, Ogasawara Y, Kondo K, Kusumoto-Matsuo R, Mori S, Ishii Y, Takeuchi T, Yamaji T, Takeuchi F, Hanada K, Kuroda M. Genetic variation of human papillomavirus type 16 in individual clinical specimens revealed by deep sequencing. PLoS One 2013; 8:e80583. [PMID: 24236186 PMCID: PMC3827439 DOI: 10.1371/journal.pone.0080583] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 05/16/2013] [Accepted: 10/04/2013] [Indexed: 01/10/2023] Open
Abstract
Viral genetic diversity within infected cells or tissues, called viral quasispecies, has been mostly studied for RNA viruses, but has also been described among DNA viruses, including human papillomavirus type 16 (HPV16) present in cervical precancerous lesions. However, the extent of HPV genetic variation in cervical specimens, and its involvement in HPV-induced carcinogenesis, remains unclear. Here, we employ deep sequencing to comprehensively analyze genetic variation in the HPV16 genome isolated from individual clinical specimens. Through overlapping full-circle PCR, approximately 8-kb DNA fragments covering the whole HPV16 genome were amplified from HPV16-positive cervical exfoliated cells collected from patients with either low-grade squamous intraepithelial lesion (LSIL) or invasive cervical cancer (ICC). Deep sequencing of the amplified HPV16 DNA enabled de novo assembly of the full-length HPV16 genome sequence for each of 7 specimens (5 LSIL and 2 ICC samples). Subsequent alignment of read sequences to the assembled HPV16 sequence revealed that 2 LSILs and 1 ICC contained nucleotide variations within E6, E1 and the non-coding region between E5 and L2 with mutation frequencies of 0.60% to 5.42%. In transient replication assays, a novel E1 mutant found in ICC, E1 Q381E, showed reduced ability to support HPV16 origin-dependent replication. In addition, partially deleted E2 genes were detected in 1 LSIL sample in a mixed state with the intact E2 gene. Thus, the methods used in this study provide a fundamental framework for investigating the influence of HPV somatic genetic variation on cervical carcinogenesis.
Affiliation(s)
- Iwao Kukimoto
- Pathogen Genomics Center, National Institute of Infectious Diseases, Tokyo, Japan
- Tomohiko Maehama
- Department of Biochemistry and Cell Biology, National Institute of Infectious Diseases, Tokyo, Japan
- Tsuyoshi Sekizuka
- Pathogen Genomics Center, National Institute of Infectious Diseases, Tokyo, Japan
- Yumiko Ogasawara
- Pathogen Genomics Center, National Institute of Infectious Diseases, Tokyo, Japan
- Rika Kusumoto-Matsuo
- Pathogen Genomics Center, National Institute of Infectious Diseases, Tokyo, Japan
- Seiichiro Mori
- Pathogen Genomics Center, National Institute of Infectious Diseases, Tokyo, Japan
- Yoshiyuki Ishii
- Pathogen Genomics Center, National Institute of Infectious Diseases, Tokyo, Japan
- Takamasa Takeuchi
- Pathogen Genomics Center, National Institute of Infectious Diseases, Tokyo, Japan
- Toshiyuki Yamaji
- Department of Biochemistry and Cell Biology, National Institute of Infectious Diseases, Tokyo, Japan
- Fumihiko Takeuchi
- Pathogen Genomics Center, National Institute of Infectious Diseases, Tokyo, Japan
- Kentaro Hanada
- Department of Biochemistry and Cell Biology, National Institute of Infectious Diseases, Tokyo, Japan
- Makoto Kuroda
- Pathogen Genomics Center, National Institute of Infectious Diseases, Tokyo, Japan
812
Pritchard CC, Salipante SJ, Koehler K, Smith C, Scroggins S, Wood B, Wu D, Lee MK, Dintzis S, Adey A, Liu Y, Eaton KD, Martins R, Stricker K, Margolin KA, Hoffman N, Churpek JE, Tait JF, King MC, Walsh T. Validation and implementation of targeted capture and sequencing for the detection of actionable mutation, copy number variation, and gene rearrangement in clinical cancer specimens. J Mol Diagn 2013; 16:56-67. [PMID: 24189654 DOI: 10.1016/j.jmoldx.2013.08.004] [Citation(s) in RCA: 216] [Impact Index Per Article: 18.0] [Received: 03/29/2013] [Revised: 06/25/2013] [Accepted: 08/07/2013] [Indexed: 11/15/2022] Open
Abstract
Recent years have seen development and implementation of anticancer therapies targeted to particular gene mutations, but methods to assay clinical cancer specimens in a comprehensive way for the critical mutations remain underdeveloped. We have developed UW-OncoPlex, a clinical molecular diagnostic assay to provide simultaneous deep-sequencing information, based on >500× average coverage, for all classes of mutations in 194 clinically relevant genes. To validate UW-OncoPlex, we tested 98 previously characterized clinical tumor specimens from 10 different cancer types, including 41 formalin-fixed paraffin-embedded tissue samples. Mixing studies indicated reliable mutation detection in samples with ≥ 10% tumor cells. In clinical samples with ≥ 10% tumor cells, UW-OncoPlex correctly identified 129 of 130 known mutations [sensitivity 99.2%, (95% CI, 95.8%-99.9%)], including single nucleotide variants, small insertions and deletions, internal tandem duplications, gene copy number gains and amplifications, gene copy losses, chromosomal gains and losses, and actionable genomic rearrangements, including ALK-EML4, ROS1, PML-RARA, and BCR-ABL. In the same samples, the assay also identified actionable point mutations in genes not previously analyzed and novel gene rearrangements of MLL and GRIK4 in melanoma, and of ASXL1, PIK3R1, and SGCZ in acute myeloid leukemia. To best guide existing and emerging treatment regimens and facilitate integration of genomic testing with patient care, we developed a framework for data analysis, decision support, and reporting clinically actionable results.
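The sensitivity figure quoted in this abstract (129 of 130 known mutations detected) can be reproduced with a short sketch. The abstract does not say which confidence-interval method the authors used; the Wilson score interval below is an assumption chosen because it needs only the standard library, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.959964):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p_hat = successes / total
    denom = 1 + z * z / total
    center = (p_hat + z * z / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / total + z * z / (4 * total * total))
    ) / denom
    return center - half_width, center + half_width


# Counts taken from the abstract: 129 of 130 known mutations identified.
detected, known = 129, 130
sensitivity = detected / known
low, high = wilson_interval(detected, known)
print(f"sensitivity = {sensitivity:.1%}, 95% CI = {low:.1%} to {high:.1%}")
```

With these counts the assumed Wilson bounds fall close to the interval reported in the abstract; an exact (Clopper-Pearson) interval would be an equally plausible choice and gives slightly different bounds.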
Affiliation(s)
- Colin C Pritchard
- Department of Laboratory Medicine, University of Washington, Seattle, Washington.
- Stephen J Salipante
- Department of Laboratory Medicine, University of Washington, Seattle, Washington; Department of Genome Sciences, University of Washington, Seattle, Washington
- Karen Koehler
- Department of Laboratory Medicine, University of Washington, Seattle, Washington
- Christina Smith
- Department of Laboratory Medicine, University of Washington, Seattle, Washington
- Sheena Scroggins
- Department of Laboratory Medicine, University of Washington, Seattle, Washington
- Brent Wood
- Department of Laboratory Medicine, University of Washington, Seattle, Washington; Department of Pathology, University of Washington, Seattle, Washington
- David Wu
- Department of Laboratory Medicine, University of Washington, Seattle, Washington
- Ming K Lee
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington
- Suzanne Dintzis
- Department of Pathology, University of Washington, Seattle, Washington
- Andrew Adey
- Department of Genome Sciences, University of Washington, Seattle, Washington
- Yajuan Liu
- Department of Pathology, University of Washington, Seattle, Washington
- Keith D Eaton
- Division of Oncology, Department of Medicine, University of Washington, Seattle, Washington
- Renato Martins
- Division of Oncology, Department of Medicine, University of Washington, Seattle, Washington
- Kari Stricker
- Division of Oncology, Department of Medicine, University of Washington, Seattle, Washington
- Kim A Margolin
- Division of Oncology, Department of Medicine, University of Washington, Seattle, Washington
- Noah Hoffman
- Department of Laboratory Medicine, University of Washington, Seattle, Washington
- Jane E Churpek
- Section of Hematology and Oncology, Department of Medicine, University of Chicago, Chicago, Illinois
- Jonathan F Tait
- Department of Laboratory Medicine, University of Washington, Seattle, Washington
- Mary-Claire King
- Department of Genome Sciences, University of Washington, Seattle, Washington; Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington
- Tom Walsh
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington
813
Dutton-Regester K, Kakavand H, Aoude LG, Stark MS, Gartside MG, Johansson P, O'Connor L, Lanagan C, Tembe V, Pupo GM, Haydu LE, Schmidt CW, Mann GJ, Thompson JF, Scolyer RA, Hayward NK. Melanomas of unknown primary have a mutation profile consistent with cutaneous sun-exposed melanoma. Pigment Cell Melanoma Res 2013; 26:852-60. [PMID: 23890154 DOI: 10.1111/pcmr.12153] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Received: 04/09/2013] [Accepted: 07/19/2013] [Indexed: 11/29/2022]
Abstract
Melanoma of unknown primary (MUP) is an uncommon phenomenon whereby patients present with metastatic disease without an evident primary site. To determine their likely site of origin, we combined exome sequencing from 33 MUPs to assess the total rate of somatic mutations and degree of UV mutagenesis. An independent cohort of 91 archival MUPs was also screened for 46 hot spot mutations highly prevalent in melanoma including BRAF, NRAS, KIT, GNAQ, and GNA11. Results showed that the majority of MUPs exhibited high somatic mutation rates, high ratios of C>T/G>A transitions, and a high rate of BRAF (45 of 101, 45%) and NRAS (32 of 101, 32%) mutations, collectively indicating a mutation profile consistent with cutaneous sun-exposed melanomas. These data suggest that a significant proportion of MUPs arise from regressed or unrecognized primary cutaneous melanomas or arise de novo in lymph nodes from nevus cells that have migrated from the skin.
Affiliation(s)
- Ken Dutton-Regester
- Oncogenomics Laboratory, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
814
Zheng Z, Geng J, Yao RE, Li C, Ying D, Shen Y, Ying L, Yu Y, Fu Q. Molecular defects identified by whole exome sequencing in a child with Fanconi anemia. Gene 2013; 530:295-300. [DOI: 10.1016/j.gene.2013.08.031] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 03/06/2013] [Revised: 08/01/2013] [Accepted: 08/09/2013] [Indexed: 01/25/2023]
815
Ferretti L, Ramos-Onsins SE, Pérez-Enciso M. Population genomics from pool sequencing. Mol Ecol 2013; 22:5561-76. [DOI: 10.1111/mec.12522] [Citation(s) in RCA: 109] [Impact Index Per Article: 9.1] [Received: 11/30/2011] [Revised: 08/03/2013] [Accepted: 09/06/2013] [Indexed: 11/30/2022]
Affiliation(s)
- Luca Ferretti
- Center for Research in Agricultural Genomics (CRAG); UAB 08193 Bellaterra Spain
- Miguel Pérez-Enciso
- Center for Research in Agricultural Genomics (CRAG); UAB 08193 Bellaterra Spain
- Department of Animal Science and Food; Faculty of Veterinary; Universitat Autonoma de Barcelona; 08193 Bellaterra Spain
- Institut Català de Recerca i Estudis Avancats (ICREA); Passeig Lluís Companys 23 08010 Barcelona Spain
816
Worthey EA. Analysis and annotation of whole-genome or whole-exome sequencing-derived variants for clinical diagnosis. CURRENT PROTOCOLS IN HUMAN GENETICS 2013; 79:9.24.1-9.24.24. [PMID: 24510652 DOI: 10.1002/0471142905.hg0924s79] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 01/07/2023]
Abstract
Over the last several years, next-generation sequencing (NGS) has transformed genomic research through substantial advances in technology and reduction in the cost of sequencing, and also in the systems required for analysis of these large volumes of data. This technology is now being used as a standard molecular diagnostic test under particular circumstances in some clinical settings. The advances in sequencing have come so rapidly that the major bottleneck in identification of causal variants is no longer the sequencing but rather the analysis and interpretation. Interpretation of genetic findings in a clinical setting is scarcely a new challenge, but the task is increasingly complex in clinical genome-wide sequencing given the dramatic increase in dataset size and complexity. This increase requires the development of novel or repositioned analysis tools, methodologies, and processes. This unit provides an overview of these items. Specific challenges related to implementation in a clinical setting are discussed.
Affiliation(s)
- Elizabeth A Worthey
- Department of Pediatrics, Medical College of Wisconsin, Milwaukee, Wisconsin.,The Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, Wisconsin.,Department of Computer Science, University of Wisconsin, Milwaukee, Wisconsin
817
Barturen G, Rueda A, Oliver JL, Hackenberg M. MethylExtract: High-Quality methylation maps and SNV calling from whole genome bisulfite sequencing data. F1000Res 2013; 2:217. [PMID: 24627790 PMCID: PMC3938178 DOI: 10.12688/f1000research.2-217.v2] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Accepted: 02/19/2014] [Indexed: 01/10/2023] Open
Abstract
Whole genome methylation profiling at a single cytosine resolution is now feasible due to the advent of high-throughput sequencing techniques together with bisulfite treatment of the DNA. To obtain the methylation value of each individual cytosine, the bisulfite-treated sequence reads are first aligned to a reference genome, and then the profiling of the methylation levels is done from the alignments. A huge effort has been made to quickly and correctly align the reads and many different algorithms and programs to do this have been created. However, the second step is just as crucial and non-trivial, but much less attention has been paid to the final inference of the methylation states. Important error sources do exist, such as sequencing errors, bisulfite failure, clonal reads, and single nucleotide variants. We developed MethylExtract, a user friendly tool to: i) generate high quality, whole genome methylation maps and ii) detect sequence variation within the same sample preparation. The program is implemented into a single script and takes into account all major error sources. MethylExtract detects variation (SNVs – Single Nucleotide Variants) in a similar way to VarScan, a very sensitive method extensively used in SNV and genotype calling based on non-bisulfite-treated reads. The usefulness of MethylExtract is shown by means of extensive benchmarking based on artificial bisulfite-treated reads and a comparison to a recently published method, called Bis-SNP. MethylExtract is able to detect SNVs within High-Throughput Sequencing experiments of bisulfite treated DNA at the same time as it generates high quality methylation maps. This simultaneous detection of DNA methylation and sequence variation is crucial for many downstream analyses, for example when deciphering the impact of SNVs on differential methylation. An exclusive feature of MethylExtract, in comparison with existing software, is the possibility to assess the bisulfite failure in a statistical way.
The source code, tutorial and artificial bisulfite datasets are available at http://bioinfo2.ugr.es/MethylExtract/ and http://sourceforge.net/projects/methylextract/, and also permanently accessible from 10.5281/zenodo.7144.
Affiliation(s)
- Guillermo Barturen
- Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Granada, 18071, Spain ; Lab. de Bioinformática, Inst. de Biotecnología, Centro de Investigación Biomédica, Granada, 18016, Spain
- Antonio Rueda
- Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Granada, 18071, Spain ; Lab. de Bioinformática, Inst. de Biotecnología, Centro de Investigación Biomédica, Granada, 18016, Spain
- José L Oliver
- Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Granada, 18071, Spain ; Lab. de Bioinformática, Inst. de Biotecnología, Centro de Investigación Biomédica, Granada, 18016, Spain
- Michael Hackenberg
- Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Granada, 18071, Spain ; Lab. de Bioinformática, Inst. de Biotecnología, Centro de Investigación Biomédica, Granada, 18016, Spain
818
Barturen G, Rueda A, Oliver JL, Hackenberg M. MethylExtract: High-Quality methylation maps and SNV calling from whole genome bisulfite sequencing data. F1000Res 2013; 2:217. [PMID: 24627790 DOI: 10.12688/f1000research.2-217.v1] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Accepted: 10/09/2013] [Indexed: 01/30/2023] Open
Abstract
Whole genome methylation profiling at a single cytosine resolution is now feasible due to the advent of high-throughput sequencing techniques together with bisulfite treatment of the DNA. To obtain the methylation value of each individual cytosine, the bisulfite-treated sequence reads are first aligned to a reference genome, and then the profiling of the methylation levels is done from the alignments. A huge effort has been made to quickly and correctly align the reads and many different algorithms and programs to do this have been created. However, the second step is just as crucial and non-trivial, but much less attention has been paid to the final inference of the methylation states. Important error sources do exist, such as sequencing errors, bisulfite failure, clonal reads, and single nucleotide variants. We developed MethylExtract, a user friendly tool to: i) generate high quality, whole genome methylation maps and ii) detect sequence variation within the same sample preparation. The program is implemented into a single script and takes into account all major error sources. MethylExtract detects variation (SNVs - Single Nucleotide Variants) in a similar way to VarScan, a very sensitive method extensively used in SNV and genotype calling based on non-bisulfite-treated reads. The usefulness of MethylExtract is shown by means of extensive benchmarking based on artificial bisulfite-treated reads and a comparison to a recently published method, called Bis-SNP. MethylExtract is able to detect SNVs within High-Throughput Sequencing experiments of bisulfite treated DNA at the same time as it generates high quality methylation maps. This simultaneous detection of DNA methylation and sequence variation is crucial for many downstream analyses, for example when deciphering the impact of SNVs on differential methylation. An exclusive feature of MethylExtract, in comparison with existing software, is the possibility to assess the bisulfite failure in a statistical way. 
The source code, tutorial and artificial bisulfite datasets are available at http://bioinfo2.ugr.es/MethylExtract/ and http://sourceforge.net/projects/methylextract/, and also permanently accessible from 10.5281/zenodo.7144.
Affiliation(s)
- Guillermo Barturen
- Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Granada, 18071, Spain ; Lab. de Bioinformática, Inst. de Biotecnología, Centro de Investigación Biomédica, Granada, 18016, Spain
- Antonio Rueda
- Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Granada, 18071, Spain ; Lab. de Bioinformática, Inst. de Biotecnología, Centro de Investigación Biomédica, Granada, 18016, Spain
- José L Oliver
- Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Granada, 18071, Spain ; Lab. de Bioinformática, Inst. de Biotecnología, Centro de Investigación Biomédica, Granada, 18016, Spain
- Michael Hackenberg
- Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Granada, 18071, Spain ; Lab. de Bioinformática, Inst. de Biotecnología, Centro de Investigación Biomédica, Granada, 18016, Spain
819
Whole-genome and whole-exome sequencing of bladder cancer identifies frequent alterations in genes involved in sister chromatid cohesion and segregation. Nat Genet 2013; 45:1459-63. [PMID: 24121792 DOI: 10.1038/ng.2798] [Citation(s) in RCA: 369] [Impact Index Per Article: 30.8] [Received: 01/15/2013] [Accepted: 09/16/2013] [Indexed: 12/15/2022]
Abstract
Bladder cancer is one of the most common cancers worldwide, with transitional cell carcinoma (TCC) being the predominant form. Here we report a genomic analysis of TCC by both whole-genome and whole-exome sequencing of 99 individuals with TCC. Beyond confirming recurrent mutations in genes previously identified as being mutated in TCC, we identified additional altered genes and pathways that were implicated in TCC. Notably, we discovered frequent alterations in STAG2 and ESPL1, two genes involved in the sister chromatid cohesion and segregation (SCCS) process. Furthermore, we also detected a recurrent fusion involving FGFR3 and TACC3, another component of SCCS, by transcriptome sequencing of 42 DNA-sequenced tumors. Overall, 32 of the 99 tumors (32%) harbored genetic alterations in the SCCS process. Our analysis provides evidence that genetic alterations affecting the SCCS process may be involved in bladder tumorigenesis and identifies a new therapeutic possibility for bladder cancer.
820
Kosugi S, Natsume S, Yoshida K, MacLean D, Cano L, Kamoun S, Terauchi R. Coval: improving alignment quality and variant calling accuracy for next-generation sequencing data. PLoS One 2013; 8:e75402. [PMID: 24116042 PMCID: PMC3792961 DOI: 10.1371/journal.pone.0075402] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Received: 06/13/2013] [Accepted: 08/14/2013] [Indexed: 11/26/2022] Open
Abstract
Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in ‘targeted’ alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/.
Affiliation(s)
- Shunichi Kosugi
- Iwate Biotechnology Research Center, Kitakami, Iwate, Japan
- Kazusa DNA Research Institute, Kisarazu, Chiba, Japan
- * E-mail: (SK); (RT)
- Daniel MacLean
- The Sainsbury Laboratory, Norwich Research Park, Norwich, United Kingdom
- Liliana Cano
- The Sainsbury Laboratory, Norwich Research Park, Norwich, United Kingdom
- Sophien Kamoun
- The Sainsbury Laboratory, Norwich Research Park, Norwich, United Kingdom
- Ryohei Terauchi
- Iwate Biotechnology Research Center, Kitakami, Iwate, Japan
- * E-mail: (SK); (RT)
821
Cruceanu C, Ambalavanan A, Spiegelman D, Gauthier J, Lafrenière RG, Dion PA, Alda M, Turecki G, Rouleau GA. Family-based exome-sequencing approach identifies rare susceptibility variants for lithium-responsive bipolar disorder. Genome 2013; 56:634-40. [DOI: 10.1139/gen-2013-0081] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Indexed: 01/28/2023]
Abstract
Bipolar disorder (BD) is a psychiatric condition characterized by the occurrence of at least two episodes of clinically disturbed mood including mania and depression. A vast literature describing BD studies suggests that a strong genetic contribution likely underlies this condition; heritability is estimated to be as high as 80%. Many studies have identified BD susceptibility loci, but because of the genetic and phenotypic heterogeneity observed across individuals, very few loci were subsequently replicated. Research in BD genetics to date has consisted of classical linkage or genome-wide association studies, which have identified candidate genes hypothesized to present common susceptibility variants. Although the observation of such common variants is informative, they can only explain a small fraction of the predicted BD heritability, suggesting a considerable contribution would come from rare and highly penetrant variants. We are seeking to identify such rare variants, and to increase the likelihood of being successful, we aimed to reduce the phenotypic heterogeneity factor by focusing on a well-defined subphenotype of BD: excellent response to lithium monotherapy. Our group has previously shown that positive response to lithium therapy clusters in families and has a consistent clinical presentation with minimal comorbidity. To identify such rare variants, we are using a targeted exome capture and high-throughput DNA sequencing approach, and analyzing the entire coding sequences of BD affected individuals from multigenerational families. We are prioritizing rare variants with a frequency of less than 1% in the population that segregate with affected status within each family, as well as being potentially highly penetrant (e.g., protein truncating, missense, or frameshift) or functionally relevant (e.g., 3′UTR, 5′UTR, or splicing).
By focusing on rare variants in a familial cohort, we hope to explain a significant portion of the missing heritability in BD, as well as to narrow our current insight on the key biochemical pathways implicated in this complex disorder.
Affiliation(s)
- Cristiana Cruceanu
- Department of Human Genetics, McGill University, Montréal, QC, Canada
- McGill Group for Suicide Studies, McGill University, Montréal, QC, Canada
- Amirthagowri Ambalavanan
- Department of Human Genetics, McGill University, Montréal, QC, Canada
- Center of Excellence in Neuroscience of the Université de Montréal-CENUM, Centre de Recherche du Centre Hospitalier de l’Université de Montréal-CRCHUM, University of Montreal, Montréal, QC, Canada
- Dan Spiegelman
- Center of Excellence in Neuroscience of the Université de Montréal-CENUM, Centre de Recherche du Centre Hospitalier de l’Université de Montréal-CRCHUM, University of Montreal, Montréal, QC, Canada
- Julie Gauthier
- Center of Excellence in Neuroscience of the Université de Montréal-CENUM, Centre de Recherche du Centre Hospitalier de l’Université de Montréal-CRCHUM, University of Montreal, Montréal, QC, Canada
- Ronald G. Lafrenière
- Center of Excellence in Neuroscience of the Université de Montréal-CENUM, Centre de Recherche du Centre Hospitalier de l’Université de Montréal-CRCHUM, University of Montreal, Montréal, QC, Canada
- Patrick A. Dion
- Center of Excellence in Neuroscience of the Université de Montréal-CENUM, Centre de Recherche du Centre Hospitalier de l’Université de Montréal-CRCHUM, University of Montreal, Montréal, QC, Canada
- Martin Alda
- Department of Psychiatry, Dalhousie University, Halifax, NS, Canada
- Gustavo Turecki
- Department of Human Genetics, McGill University, Montréal, QC, Canada
- McGill Group for Suicide Studies, McGill University, Montréal, QC, Canada
- Guy A. Rouleau
- Department of Human Genetics, McGill University, Montréal, QC, Canada
- Center of Excellence in Neuroscience of the Université de Montréal-CENUM, Centre de Recherche du Centre Hospitalier de l’Université de Montréal-CRCHUM, University of Montreal, Montréal, QC, Canada
- Montreal Neurological Institute and Hospital, McGill University, Montréal, QC, Canada
822
Cabanski CR, Wilkerson MD, Soloway M, Parker JS, Liu J, Prins JF, Marron JS, Perou CM, Hayes DN. BlackOPs: increasing confidence in variant detection through mappability filtering. Nucleic Acids Res 2013; 41:e178. [PMID: 23935067 PMCID: PMC3799449 DOI: 10.1093/nar/gkt692] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 04/26/2013] [Revised: 06/24/2013] [Accepted: 07/16/2013] [Indexed: 01/05/2023] Open
Abstract
Identifying variants using high-throughput sequencing data is currently a challenge because true biological variants can be indistinguishable from technical artifacts. One source of technical artifact results from incorrectly aligning experimentally observed sequences to their true genomic origin ('mismapping') and inferring differences in mismapped sequences to be true variants. We developed BlackOPs, an open-source tool that simulates experimental RNA-seq and DNA whole exome sequences derived from the reference genome, aligns these sequences by custom parameters, detects variants and outputs a blacklist of positions and alleles caused by mismapping. Blacklists contain thousands of artifact variants that are indistinguishable from true variants and, for a given sample, are expected to be almost completely false positives. We show that these blacklist positions are specific to the alignment algorithm and read length used, and BlackOPs allows users to generate a blacklist specific to their experimental setup. We queried the dbSNP and COSMIC variant databases and found numerous variants indistinguishable from mapping errors. We demonstrate how filtering against blacklist positions reduces the number of potential false variants using an RNA-seq glioblastoma cell line data set. In summary, accounting for mapping-caused variants tuned to experimental setups reduces false positives and, therefore, improves genome characterization by high-throughput sequencing.
Affiliation(s)
- Christopher R. Cabanski
- Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599, USA, The Genome Institute at Washington University, St. Louis, MO 63108, USA, Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Computer Science, University of Kentucky, Lexington, KY 40506, USA, Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA and Division of Medical Oncology, Department of Internal Medicine, University of North Carolina, Chapel Hill, NC 27599, USA
- Matthew D. Wilkerson
- Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599, USA, The Genome Institute at Washington University, St. Louis, MO 63108, USA, Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Computer Science, University of Kentucky, Lexington, KY 40506, USA, Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA and Division of Medical Oncology, Department of Internal Medicine, University of North Carolina, Chapel Hill, NC 27599, USA
- Matthew Soloway
- Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599, USA, The Genome Institute at Washington University, St. Louis, MO 63108, USA, Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Computer Science, University of Kentucky, Lexington, KY 40506, USA, Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA and Division of Medical Oncology, Department of Internal Medicine, University of North Carolina, Chapel Hill, NC 27599, USA
- Joel S. Parker
- Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599, USA, The Genome Institute at Washington University, St. Louis, MO 63108, USA, Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Computer Science, University of Kentucky, Lexington, KY 40506, USA, Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA and Division of Medical Oncology, Department of Internal Medicine, University of North Carolina, Chapel Hill, NC 27599, USA
- Jinze Liu
- Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599, USA, The Genome Institute at Washington University, St. Louis, MO 63108, USA, Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Computer Science, University of Kentucky, Lexington, KY 40506, USA, Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA and Division of Medical Oncology, Department of Internal Medicine, University of North Carolina, Chapel Hill, NC 27599, USA
- Jan F. Prins
- Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599, USA, The Genome Institute at Washington University, St. Louis, MO 63108, USA, Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Computer Science, University of Kentucky, Lexington, KY 40506, USA, Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA and Division of Medical Oncology, Department of Internal Medicine, University of North Carolina, Chapel Hill, NC 27599, USA
- J. S. Marron
- Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599, USA, The Genome Institute at Washington University, St. Louis, MO 63108, USA, Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Computer Science, University of Kentucky, Lexington, KY 40506, USA, Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA and Division of Medical Oncology, Department of Internal Medicine, University of North Carolina, Chapel Hill, NC 27599, USA
- Charles M. Perou
- Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599, USA, The Genome Institute at Washington University, St. Louis, MO 63108, USA, Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Computer Science, University of Kentucky, Lexington, KY 40506, USA, Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA and Division of Medical Oncology, Department of Internal Medicine, University of North Carolina, Chapel Hill, NC 27599, USA
- D. Neil Hayes
- Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599, USA, The Genome Institute at Washington University, St. Louis, MO 63108, USA, Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA, Department of Computer Science, University of Kentucky, Lexington, KY 40506, USA, Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA and Division of Medical Oncology, Department of Internal Medicine, University of North Carolina, Chapel Hill, NC 27599, USA
823
Yu X, Sun S. Comparing a few SNP calling algorithms using low-coverage sequencing data. BMC Bioinformatics 2013; 14:274. [PMID: 24044377 PMCID: PMC3848615 DOI: 10.1186/1471-2105-14-274] [Citation(s) in RCA: 84] [Impact Index Per Article: 7.0] [Received: 05/10/2013] [Accepted: 09/12/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Many Single Nucleotide Polymorphism (SNP) calling programs have been developed to identify Single Nucleotide Variations (SNVs) in next-generation sequencing (NGS) data. However, low sequencing coverage presents challenges to accurate SNV identification, especially in single-sample data. Moreover, commonly used SNP calling programs usually include several metrics in their output files for each potential SNP. These metrics are highly correlated in complex patterns, making it extremely difficult to select SNPs for further experimental validations. RESULTS: To explore solutions to the above challenges, we compare the performance of four SNP calling algorithms, SOAPsnp, Atlas-SNP2, SAMtools, and GATK, in a low-coverage single-sample sequencing dataset. Without any post-output filtering, SOAPsnp calls more SNVs than the other programs since it has fewer internal filtering criteria. Atlas-SNP2 has stringent internal filtering criteria; thus it reports the fewest SNVs. The numbers of SNVs called by GATK and SAMtools fall between SOAPsnp and Atlas-SNP2. Moreover, we explore the values of key metrics related to SNVs' quality in each algorithm and use them as post-output filtering criteria to filter out low quality SNVs. Under different coverage cutoff values, we compare four algorithms and calculate the empirical positive calling rate and sensitivity. Our results show that: 1) the overall agreement of the four calling algorithms is low, especially in non-dbSNPs; 2) the agreement of the four algorithms is similar when using different coverage cutoffs, except that the non-dbSNPs agreement level tends to increase slightly with increasing coverage; 3) SOAPsnp, SAMtools, and GATK have a higher empirical calling rate for dbSNPs compared to non-dbSNPs; and 4) overall, GATK and Atlas-SNP2 have a relatively higher positive calling rate and sensitivity, but GATK calls more SNVs.
CONCLUSIONS: Our results show that the agreement between different calling algorithms is relatively low. Thus, more caution should be used in choosing algorithms, setting filtering parameters, and designing validation studies. For reliable SNV calling results, we recommend that users employ more than one algorithm and use metrics related to calling quality and coverage as filtering criteria.
Affiliation(s)
- Xiaoqing Yu
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, Ohio 44106, USA.
824
Liu G, Zhang L, Qin Y, Zou G, Li Z, Yan X, Wei X, Chen M, Chen L, Zheng K, Zhang J, Ma L, Li J, Liu R, Xu H, Bao X, Fang X, Wang L, Zhong Y, Liu W, Zheng H, Wang S, Wang C, Xun L, Zhao GP, Wang T, Zhou Z, Qu Y. Long-term strain improvements accumulate mutations in regulatory elements responsible for hyper-production of cellulolytic enzymes. Sci Rep 2013; 3:1569. [PMID: 23535838 PMCID: PMC3610096 DOI: 10.1038/srep01569] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.3] [Received: 02/05/2013] [Accepted: 03/13/2013] [Indexed: 12/12/2022] Open
Abstract
Long-term strain improvements through repeated mutagenesis and screening have generated a hyper-producer of cellulases and hemicellulases from Penicillium decumbens 114 which was isolated 30 years ago. Here, the genome of the hyper-producer P. decumbens JU-A10-T was sequenced and compared with that of the wild-type strain 114-2. Further, the transcriptomes and secretomes were compared between the strains. Selective hyper-production of cellulases and hemicellulases but not all the secreted proteins was observed in the mutant, making it a more specific producer of lignocellulolytic enzymes. Functional analysis identified that changes in several transcriptional regulatory elements played crucial roles in the cellulase hyper-producing characteristics of the mutant. Additionally, the mutant showed enhanced supply of amino acids and decreased synthesis of secondary metabolites compared with the wild-type. The results clearly point out that we can target gene regulators and promoters with minimal alterations of the genetic content but maximal effects in genetic engineering.
Affiliation(s)
- Guodong Liu
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, Shandong, China
825
Peralta M, Combes MC, Cenci A, Lashermes P, Dereeper A. SNiPloid: A Utility to Exploit High-Throughput SNP Data Derived from RNA-Seq in Allopolyploid Species. Int J Plant Genomics 2013; 2013:890123. [PMID: 24163691 PMCID: PMC3791807 DOI: 10.1155/2013/890123] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 05/15/2013] [Revised: 07/26/2013] [Accepted: 07/30/2013] [Indexed: 05/18/2023]
Abstract
High-throughput sequencing is a common approach to discover SNP variants, especially in plant species. However, methods to analyze predicted SNPs are often optimized for diploid plant species whereas many crop species are allopolyploids and combine related but divergent subgenomes (homoeologous chromosome sets). We created a software tool, SNiPloid, that exploits and interprets putative SNPs in the context of allopolyploidy by comparing SNPs from an allopolyploid with those obtained in its modern-day diploid progenitors. SNiPloid can compare SNPs obtained from a sample to estimate the subgenome contribution to the transcriptome or SNPs obtained from two polyploid accessions to search for SNP divergence.
Affiliation(s)
- Marine Peralta
- UMR RPB, IRD (Institut de Recherche pour le Développement), 911 Avenue Agropolis, BP 64501, 34394 Montpellier Cedex 5, France
- Marie-Christine Combes
- UMR RPB, IRD (Institut de Recherche pour le Développement), 911 Avenue Agropolis, BP 64501, 34394 Montpellier Cedex 5, France
- Alberto Cenci
- UMR RPB, IRD (Institut de Recherche pour le Développement), 911 Avenue Agropolis, BP 64501, 34394 Montpellier Cedex 5, France
- Philippe Lashermes
- UMR RPB, IRD (Institut de Recherche pour le Développement), 911 Avenue Agropolis, BP 64501, 34394 Montpellier Cedex 5, France
- Alexis Dereeper
- UMR RPB, IRD (Institut de Recherche pour le Développement), 911 Avenue Agropolis, BP 64501, 34394 Montpellier Cedex 5, France
826
Strino F, Parisi F, Micsinai M, Kluger Y. TrAp: a tree approach for fingerprinting subclonal tumor composition. Nucleic Acids Res 2013; 41:e165. [PMID: 23892400 PMCID: PMC3783191 DOI: 10.1093/nar/gkt641] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.5] [Received: 01/24/2013] [Revised: 06/11/2013] [Accepted: 07/02/2013] [Indexed: 01/01/2023] Open
Abstract
Revealing the clonal composition of a single tumor is essential for identifying cell subpopulations with metastatic potential in primary tumors or with resistance to therapies in metastatic tumors. Sequencing technologies provide only an overview of the aggregate of numerous cells. Computational approaches to de-mix a collective signal composed of the aberrations of a mixed cell population of a tumor sample into its individual components are not available. We propose an evolutionary framework for deconvolving data from a single genome-wide experiment to infer the composition, abundance and evolutionary paths of the underlying cell subpopulations of a tumor. We have developed an algorithm (TrAp) for solving this mixture problem. In silico analyses show that TrAp correctly deconvolves mixed subpopulations when the number of subpopulations and the measurement errors are moderate. We demonstrate the applicability of the method using tumor karyotypes and somatic hypermutation data sets. We applied TrAp to an Exome-Seq experiment on a renal cell carcinoma tumor sample and compared the mutational profile of the inferred subpopulations to the mutational profiles of single cells of the same tumor. Finally, we deconvolve sequencing data from eight acute myeloid leukemia patients and three distinct metastases of one melanoma patient to exhibit the evolutionary relationships of their subpopulations.
Affiliation(s)
- Francesco Strino
- Department of Pathology, Yale University School of Medicine, New Haven, CT 06520, USA, NYU Center for Health Informatics and Bioinformatics, New York University Langone Medical Center, 227 East 30th Street, New York, NY 10016, USA and Yale Cancer Center, New Haven, CT 06520, USA
- Fabio Parisi
- Department of Pathology, Yale University School of Medicine, New Haven, CT 06520, USA, NYU Center for Health Informatics and Bioinformatics, New York University Langone Medical Center, 227 East 30th Street, New York, NY 10016, USA and Yale Cancer Center, New Haven, CT 06520, USA
- Mariann Micsinai
- Department of Pathology, Yale University School of Medicine, New Haven, CT 06520, USA, NYU Center for Health Informatics and Bioinformatics, New York University Langone Medical Center, 227 East 30th Street, New York, NY 10016, USA and Yale Cancer Center, New Haven, CT 06520, USA
- Yuval Kluger
- Department of Pathology, Yale University School of Medicine, New Haven, CT 06520, USA, NYU Center for Health Informatics and Bioinformatics, New York University Langone Medical Center, 227 East 30th Street, New York, NY 10016, USA and Yale Cancer Center, New Haven, CT 06520, USA
827
Kim S, Jeong K, Bhutani K, Lee JH, Patel A, Scott E, Nam H, Lee H, Gleeson JG, Bafna V. Virmid: accurate detection of somatic mutations with sample impurity inference. Genome Biol 2013; 14:R90. [PMID: 23987214 PMCID: PMC4054681 DOI: 10.1186/gb-2013-14-8-r90] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 04/30/2013] [Revised: 07/17/2013] [Accepted: 08/29/2013] [Indexed: 11/10/2022] Open
Abstract
Detection of somatic variation using sequence from disease-control matched data sets is a critical first step. In many cases including cancer, however, it is hard to isolate pure disease tissue, and the impurity hinders accurate mutation analysis by disrupting overall allele frequencies. Here, we propose a new method, Virmid, that explicitly determines the level of impurity in the sample, and uses it for improved detection of somatic variation. Extensive tests on simulated and real sequencing data from breast cancer and hemimegalencephaly demonstrate the power of our model. A software implementation of our method is available at http://sourceforge.net/projects/virmid/.
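How impurity distorts allele frequencies can be shown with a short calculation. The sketch assumes a diploid locus, a heterozygous somatic mutation, and a fraction `alpha` of control cells in the disease sample; the function names are hypothetical, and the naive point estimate is a stand-in for illustration, not Virmid's own estimator.

```python
def expected_vaf(alpha, alt_copies=1, ploidy=2):
    """Expected variant allele fraction of a somatic mutation when a
    fraction `alpha` of the sampled cells are control (non-disease) cells."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * alt_copies / ploidy


def naive_alpha(alt_reads, total_reads):
    """Naive impurity estimate from one heterozygous somatic site,
    obtained by inverting expected_vaf; assumes no sequencing error."""
    vaf = alt_reads / total_reads
    return min(1.0, max(0.0, 1.0 - 2.0 * vaf))


# A pure sample shows a VAF of 0.5; contamination pulls it towards zero.
for alpha in (0.0, 0.2, 0.4, 0.6):
    print(f"alpha={alpha:.1f}  expected VAF={expected_vaf(alpha):.2f}")

# 30 variant reads out of 100 at a heterozygous somatic site.
print(f"naive alpha estimate: {naive_alpha(30, 100):.2f}")
```

A caller that expects a VAF near 0.5 would tend to miss the heavily contaminated cases, which is the motivation the abstract gives for estimating impurity explicitly.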
Affiliation(s)
- Sangwoo Kim
- Department of Computer Science and Engineering, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Kyowon Jeong
- Department of Electrical and Computer Engineering, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Kunal Bhutani
- Department of Computer Science and Engineering, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Jeong Ho Lee
- Institute for Genomic Medicine, Rady Children's Hospital, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Graduate School of Medical Science and Engineering, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 305-701, Republic of Korea
- Anand Patel
- Department of Computer Science and Engineering, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Eric Scott
- Institute for Genomic Medicine, Rady Children's Hospital, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Hojung Nam
- School of Information and Communications, Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju, 500-712, Republic of Korea
- Hayan Lee
- Department of Computer Science, Stony Brook University, 100 Nicolls Road, NY 11794, USA
- Joseph G Gleeson
- Institute for Genomic Medicine, Rady Children's Hospital, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Vineet Bafna
- Department of Computer Science and Engineering, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
828
de Miranda NFCC, Peng R, Georgiou K, Wu C, Falk Sörqvist E, Berglund M, Chen L, Gao Z, Lagerstedt K, Lisboa S, Roos F, van Wezel T, Teixeira MR, Rosenquist R, Sundström C, Enblad G, Nilsson M, Zeng Y, Kipling D, Pan-Hammarström Q. DNA repair genes are selectively mutated in diffuse large B cell lymphomas. J Exp Med 2013; 210:1729-42. [PMID: 23960188 PMCID: PMC3754869 DOI: 10.1084/jem.20122842] [Citation(s) in RCA: 82] [Impact Index Per Article: 6.8] [Indexed: 12/22/2022]
Abstract
DNA repair mechanisms are fundamental for B cell development, which relies on the somatic diversification of the immunoglobulin genes by V(D)J recombination, somatic hypermutation, and class switch recombination. Their failure is postulated to promote genomic instability and malignant transformation in B cells. By performing targeted sequencing of 73 key DNA repair genes in 29 B cell lymphoma samples, somatic and germline mutations were identified in various DNA repair pathways, mainly in diffuse large B cell lymphomas (DLBCLs). Mutations in mismatch repair genes (EXO1, MSH2, and MSH6) were associated with microsatellite instability, increased number of somatic insertions/deletions, and altered mutation signatures in tumors. Somatic mutations in nonhomologous end-joining (NHEJ) genes (DCLRE1C/ARTEMIS, PRKDC/DNA-PKcs, XRCC5/KU80, and XRCC6/KU70) were identified in four DLBCL tumors and cytogenetic analyses revealed that translocations involving the immunoglobulin-heavy chain locus occurred exclusively in NHEJ-mutated samples. The novel mutation targets, CHEK2 and PARP1, were further screened in expanded DLBCL cohorts, and somatic as well as novel and rare germline mutations were identified in 8 and 5% of analyzed tumors, respectively. By correlating defects in a subset of DNA damage response and repair genes with genomic instability events in tumors, we propose that these genes play a role in DLBCL lymphomagenesis.
Affiliation(s)
- Noel F C C de Miranda
- Clinical Immunology, Department of Laboratory Medicine, Karolinska Institutet at Karolinska University Hospital, Huddinge, Sweden
829
Wang K, Kim C, Bradfield J, Guo Y, Toskala E, Otieno FG, Hou C, Thomas K, Cardinale C, Lyon GJ, Golhar R, Hakonarson H. Whole-genome DNA/RNA sequencing identifies truncating mutations in RBCK1 in a novel Mendelian disease with neuromuscular and cardiac involvement. Genome Med 2013; 5:67. [PMID: 23889995 PMCID: PMC3971341 DOI: 10.1186/gm471] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.0] [Received: 04/25/2013] [Revised: 07/15/2013] [Accepted: 07/26/2013] [Indexed: 12/19/2022] Open
Abstract
Background Whole-exome sequencing has identified the causes of several Mendelian diseases by analyzing multiple unrelated cases, but it is more challenging to resolve the cause of extremely rare and suspected Mendelian diseases from individual families. We identified a family quartet with two children, both affected with a previously unreported disease, characterized by progressive muscular weakness and cardiomyopathy, with normal intelligence. During the course of the study, we identified one additional unrelated patient with a comparable phenotype. Methods We performed whole-genome sequencing (Complete Genomics platform), whole-exome sequencing (Agilent SureSelect exon capture and Illumina Genome Analyzer II platform), SNP genotyping (Illumina HumanHap550 SNP array) and Sanger sequencing on blood samples, as well as RNA-Seq (Illumina HiSeq platform) on transformed lymphoblastoid cell lines. Results From whole-genome sequence data, we identified RBCK1, a gene encoding an E3 ubiquitin-protein ligase, as the most likely candidate gene, with two protein-truncating mutations in probands in the first family. However, exome data failed to nominate RBCK1 as a candidate gene, due to poor regional coverage. Sanger sequencing identified a private homozygous splice variant in RBCK1 in the proband in the second family, yet SNP genotyping revealed a 1.2Mb copy-neutral region of homozygosity covering RBCK1. RNA-Seq confirmed aberrant splicing of RBCK1 transcripts, resulting in truncated protein products. Conclusions While the exact mechanism by which these mutations cause disease is unknown, our study represents an example of how the combined use of whole-genome DNA and RNA sequencing can identify a disease-predisposing gene for a novel and extremely rare Mendelian disease.
Affiliation(s)
- Kai Wang
- Zilkha Neurogenetic Institute, Keck School of Medicine, University of Southern California, 1501 San Pablo St, Los Angeles, CA 90089, USA; Center for Applied Genomics, Children's Hospital of Philadelphia, 3615 Civic Center Blvd, Philadelphia, PA 19104, USA
- Cecilia Kim
- Center for Applied Genomics, Children's Hospital of Philadelphia, 3615 Civic Center Blvd, Philadelphia, PA 19104, USA
- Jonathan Bradfield
- Center for Applied Genomics, Children's Hospital of Philadelphia, 3615 Civic Center Blvd, Philadelphia, PA 19104, USA
- Yunfei Guo
- Zilkha Neurogenetic Institute, Keck School of Medicine, University of Southern California, 1501 San Pablo St, Los Angeles, CA 90089, USA
- Elina Toskala
- Center for Applied Genomics, Children's Hospital of Philadelphia, 3615 Civic Center Blvd, Philadelphia, PA 19104, USA
- Frederick G Otieno
- Center for Applied Genomics, Children's Hospital of Philadelphia, 3615 Civic Center Blvd, Philadelphia, PA 19104, USA
- Cuiping Hou
- Center for Applied Genomics, Children's Hospital of Philadelphia, 3615 Civic Center Blvd, Philadelphia, PA 19104, USA
- Kelly Thomas
- Center for Applied Genomics, Children's Hospital of Philadelphia, 3615 Civic Center Blvd, Philadelphia, PA 19104, USA
- Christopher Cardinale
- Center for Applied Genomics, Children's Hospital of Philadelphia, 3615 Civic Center Blvd, Philadelphia, PA 19104, USA
- Gholson J Lyon
- Center for Applied Genomics, Children's Hospital of Philadelphia, 3615 Civic Center Blvd, Philadelphia, PA 19104, USA; Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, One Bungtown Rd, NY 11724, USA
- Ryan Golhar
- Center for Applied Genomics, Children's Hospital of Philadelphia, 3615 Civic Center Blvd, Philadelphia, PA 19104, USA
- Hakon Hakonarson
- Center for Applied Genomics, Children's Hospital of Philadelphia, 3615 Civic Center Blvd, Philadelphia, PA 19104, USA; Department of Pediatrics, University of Pennsylvania School of Medicine, 3451 Walnut St, Philadelphia, PA 19104, USA
830
Pavlopoulos GA, Oulas A, Iacucci E, Sifrim A, Moreau Y, Schneider R, Aerts J, Iliopoulos I. Unraveling genomic variation from next generation sequencing data. BioData Min 2013; 6:13. [PMID: 23885890 PMCID: PMC3726446 DOI: 10.1186/1756-0381-6-13] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 03/22/2013] [Accepted: 07/18/2013] [Indexed: 12/29/2022] Open
Abstract
Elucidating the content of a DNA sequence is critical to understanding and decoding the genetic information of any biological system. As next generation sequencing (NGS) techniques have become cheaper and more advanced in throughput over time, great innovations and breakthrough conclusions have been generated in various biological areas. A few of the areas being shaped by these technological advances are evolution of species, microbial mapping, population genetics, genome-wide association studies (GWAS), comparative genomics, variant analysis, gene expression, gene regulation, epigenetics and personalized medicine. While NGS techniques stand as key players in modern biological research, the analysis and interpretation of the vast amount of data produced is not an easy or trivial task and remains a great challenge in the field of bioinformatics. Therefore, efficient tools that cope with information overload, tackle the high complexity and provide meaningful visualizations to make knowledge extraction easier are essential. In this article, we briefly refer to the sequencing methodologies and the available equipment that serve these analyses, and we describe the data formats of the files they produce. We conclude with a thorough review of tools developed to efficiently store, analyze and visualize such data, with emphasis on structural variation analysis and comparative genomics. Finally, we comment on their functionality, strengths and weaknesses, and we discuss how future applications could further develop in this field.
Affiliation(s)
- Georgios A Pavlopoulos
- Division of Basic Sciences, University of Crete Medical School, Heraklion 71110, Greece.
831
McElroy K, Zagordi O, Bull R, Luciani F, Beerenwinkel N. Accurate single nucleotide variant detection in viral populations by combining probabilistic clustering with a statistical test of strand bias. BMC Genomics 2013; 14:501. [PMID: 23879730 PMCID: PMC3848937 DOI: 10.1186/1471-2164-14-501] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 12/19/2012] [Accepted: 07/15/2013] [Indexed: 11/10/2022] Open
Abstract
Background Deep sequencing is a powerful tool for assessing viral genetic diversity. Such experiments harness the high coverage afforded by next generation sequencing protocols by treating sequencing reads as a population sample. Distinguishing true single nucleotide variants (SNVs) from sequencing errors remains challenging, however. Current protocols are characterised by high false positive rates, with results requiring time consuming manual checking. Results By statistical modelling, we show that if multiple variant sites are considered at once, SNVs can be called reliably from high coverage viral deep sequencing data at frequencies lower than the error rate of the sequencing technology, and that SNV calling accuracy increases as true sequence diversity within a read length increases. We demonstrate these findings on two control data sets, showing that SNV detection is more reliable on a high diversity human immunodeficiency virus sample as compared to a moderate diversity sample of hepatitis C virus. Finally, we show that in situations where probabilistic clustering retains false positive SNVs (for instance due to insufficient sample diversity or systematic errors), applying a strand bias test based on a beta-binomial model of forward read distribution can improve precision, with negligible cost to true positive recall. Conclusions By combining probabilistic clustering (implemented in the program ShoRAH) with a statistical test of strand bias, SNVs may be called from deeply sequenced viral populations with high accuracy.
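The idea of a strand bias test built on a beta-binomial model of forward read counts can be sketched as follows. This is an illustration under stated assumptions, not the ShoRAH implementation: the shape parameters `a` and `b` are hypothetical values assumed to have been estimated beforehand from the forward/reverse read distribution across the data set, and the two-sided p-value definition (summing all outcomes no more probable than the observed one) is one common choice rather than the authors' documented procedure.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_pmf(k, n, a, b):
    """P(K = k) for a beta-binomial with n trials and shape parameters a, b."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


def strand_bias_pvalue(fwd_variant, total_variant, a, b):
    """Two-sided p-value for seeing `fwd_variant` of `total_variant`
    variant-supporting reads on the forward strand."""
    p_obs = betabinom_pmf(fwd_variant, total_variant, a, b)
    # Small relative tolerance so that ties survive floating-point rounding.
    tail = sum(
        p
        for p in (betabinom_pmf(k, total_variant, a, b)
                  for k in range(total_variant + 1))
        if p <= p_obs * (1.0 + 1e-9)
    )
    return min(1.0, tail)


# Hypothetical parameters: forward fraction centred on 0.5 with some spread.
a, b = 20.0, 20.0
print(strand_bias_pvalue(20, 40, a, b))  # balanced strands
print(strand_bias_pvalue(40, 40, a, b))  # variant seen on one strand only
```

A candidate SNV supported almost entirely by reads from one strand receives a very small p-value and could be discarded as a likely systematic error, while balanced support leaves the call untouched.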
Affiliation(s)
- Kerensa McElroy
- Centre for Marine Bioinnovation and School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia.
832
Bian J, Liu C, Wang H, Xing J, Kachroo P, Zhou X. SNVHMM: predicting single nucleotide variants from next generation sequencing. BMC Bioinformatics 2013; 14:225. [PMID: 23855743 PMCID: PMC3718670 DOI: 10.1186/1471-2105-14-225] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 02/27/2013] [Accepted: 07/03/2013] [Indexed: 02/04/2023] Open
Abstract
Background The rapid development of next generation sequencing (NGS) technology provides a novel avenue for genomic exploration and research. Single nucleotide variants (SNVs) inferred from next generation sequencing are expected to reveal gene mutations in cancer. However, NGS has lower sequence coverage and poor SNV detection capability in the regulatory regions of the genome. Posterior probability-based methods are efficient for detecting SNVs in high-coverage regions or in sequencing data with high depth. However, for data with low sequencing depth, the efficiency of such algorithms remains poor and needs to be improved. Results A new tool, SNVHMM, based on a discrete hidden Markov model (HMM), was developed to infer the genotype for each position on the genome. We incorporated the mapping quality of each read and the corresponding base quality on the reads into the emission probability of the HMM. The context information of the whole observation, as well as its confidence, was fully utilized to infer the genotype for each position on the genome under study. Therefore, more power can be gained over Bayes-based methods, which is very useful for SNV detection in data with low sequencing depth. Moreover, our model was verified by testing against two data sets each of lobular breast tumor and myelodysplastic syndromes (MDS). Compared with a recently published SNV calling algorithm, SNVMix2, our model performed substantially better when the sequencing depth was low and also outperformed SNVMix2 when SNVMix2 was well trained on large datasets. Conclusions SNVHMM can detect SNVs from NGS cancer data efficiently even if the sequencing depth is very low. The training data size can be very small for SNVHMM to work. SNVHMM incorporates the base quality and mapping quality of all observed bases and reads, and also provides the option for users to choose the confidence of the observation for SNV prediction.
833
Badouin H, Belkhir K, Gregson E, Galindo J, Sundström L, Martin SJ, Butlin RK, Smadja CM. Transcriptome characterisation of the ant Formica exsecta with new insights into the evolution of desaturase genes in social hymenoptera. PLoS One 2013; 8:e68200. [PMID: 23874539 PMCID: PMC3709892 DOI: 10.1371/journal.pone.0068200] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 03/15/2013] [Accepted: 05/28/2013] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND Despite the recent sequencing of seven ant genomes, no genomic data are available for the genus Formica, an important group for the study of eusocial traits. We sequenced the transcriptome of the ant Formica exsecta with the 454 FLX Titanium technology from a pooled sample of workers from 70 Finnish colonies. RESULTS About 1,000,000 reads were obtained from a normalised cDNA library. We compared the assemblers MIRA3.0 and Newbler2.6 and showed that the latter performed better on this dataset due to a new option dedicated to improving contig formation in low-depth portions of the assemblies. The 29,579 contigs represent 27 Mb. Of these, 50% showed similarity with known proteins and 25% could be assigned a gene ontology category. We found more than 13,000 high-quality single nucleotide polymorphisms. The Δ9 desaturase gene family is an important multigene family involved in chemical communication in insects. We found six Δ9 desaturases in this Formica exsecta transcriptome dataset that were used to reconstruct a maximum-likelihood phylogeny of insect desaturases and to test for signatures of positive selection in this multigene family in ant lineages. We found differences with previous phylogenies of this gene family in ants, and found two clades potentially under positive selection. CONCLUSION This first transcriptome reference sequence of Formica exsecta provided sequence and polymorphism data that will allow researchers working on Formica ants to develop studies to tackle the genetic basis of eusocial phenotypes. In addition, this study provided some general guidelines for de novo transcriptome assembly that should be useful for future transcriptome sequencing projects. Finally, we found potential signatures of positive selection in some clades of the Δ9 desaturase gene family in ants, which suggest a potential role of sequence divergence and adaptive evolution in shaping the large diversity of chemical cues in social insects.
Affiliation(s)
- Hélène Badouin
- Centre National de la Recherche Scientifique CNRS - Institut des Sciences de l'Evolution UMR 5554, Université Montpellier 2, Montpellier, France.
834
Roberts ND, Kortschak RD, Parker WT, Schreiber AW, Branford S, Scott HS, Glonek G, Adelson DL. A comparative analysis of algorithms for somatic SNV detection in cancer. Bioinformatics 2013; 29:2223-30. [PMID: 23842810 PMCID: PMC3753564 DOI: 10.1093/bioinformatics/btt375] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.8] [Indexed: 02/04/2023]
Abstract
Motivation: With the advent of relatively affordable high-throughput technologies, DNA sequencing of cancers is now common practice in cancer research projects and will be increasingly used in clinical practice to inform diagnosis and treatment. Somatic (cancer-only) single nucleotide variants (SNVs) are the simplest class of mutation, yet their identification in DNA sequencing data is confounded by germline polymorphisms, tumour heterogeneity and sequencing and analysis errors. Four recently published algorithms for the detection of somatic SNV sites in matched cancer–normal sequencing datasets are VarScan, SomaticSniper, JointSNVMix and Strelka. In this analysis, we apply these four SNV calling algorithms to cancer–normal Illumina exome sequencing of a chronic myeloid leukaemia (CML) patient. The candidate SNV sites returned by each algorithm are filtered to remove likely false positives, then characterized and compared to investigate the strengths and weaknesses of each SNV calling algorithm. Results: Comparing the candidate SNV sets returned by VarScan, SomaticSniper, JointSNVMix2 and Strelka revealed substantial differences with respect to the number and character of sites returned; the somatic probability scores assigned to the same sites; their susceptibility to various sources of noise; and their sensitivities to low-allelic-fraction candidates. Availability: Data accession number SRA081939, code at http://code.google.com/p/snv-caller-review/ Contact: david.adelson@adelaide.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Nicola D Roberts
- School of Molecular and Biomedical Science and School of Mathematical Sciences, University of Adelaide, South Australia, Australia
835
Nocq J, Celton M, Gendron P, Lemieux S, Wilhelm BT. Harnessing virtual machines to simplify next-generation DNA sequencing analysis. Bioinformatics 2013; 29:2075-83. [PMID: 23786767 DOI: 10.1093/bioinformatics/btt352] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Indexed: 01/05/2023] Open
Abstract
MOTIVATION The growth of next-generation sequencing (NGS) has not only dramatically accelerated the pace of research in the field of genomics, but it has also opened the door to personalized medicine and diagnostics. The resulting flood of data has led to the rapid development of large numbers of bioinformatic tools for data analysis, creating a challenging situation for researchers when choosing and configuring a variety of software for their analysis, and for other researchers trying to replicate their analysis. As NGS technology continues to expand from the research environment into clinical laboratories, the challenges associated with data analysis have the potential to slow the adoption of this technology. RESULTS Here we discuss the potential of virtual machines (VMs) to be used as a method for sharing entire installations of NGS software (bioinformatic 'pipelines'). VMs are created by programs designed to allow multiple operating systems to co-exist on a single physical machine, and they can be made following the object-oriented paradigm of encapsulating data and methods together. This allows NGS data to be distributed within a VM, along with the pre-configured software for its analysis. Although VMs have historically suffered from poor performance relative to native operating systems, we present benchmarking results demonstrating that this reduced performance can now be minimized. We further discuss the many potential benefits of VMs as a solution for NGS analysis and describe several published examples. Lastly, we consider the benefits of VMs in facilitating the introduction of NGS technology into the clinical environment. CONTACT brian.wilhelm@umontreal.ca.
Affiliation(s)
- Julie Nocq
- Institute for Research in Immunology and Cancer, Laboratory for High-Throughput Genomics, Department of Medicine, University of Montreal, QC, Canada
836
Novel form of X-linked nonsyndromic hearing loss with cochlear malformation caused by a mutation in the type IV collagen gene COL4A6. Eur J Hum Genet 2013; 22:208-15. [PMID: 23714752 DOI: 10.1038/ejhg.2013.108] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Received: 10/30/2012] [Revised: 04/08/2013] [Accepted: 04/19/2013] [Indexed: 12/16/2022] Open
Abstract
Hereditary hearing loss is the most common human sensorineural disorder. Genetic causes are highly heterogeneous, with mutations detected in >40 genes associated with nonsyndromic hearing loss, to date. Whereas autosomal recessive and autosomal dominant inheritance is prevalent, X-linked forms of nonsyndromic hearing impairment are extremely rare. Here, we present a Hungarian three-generation family with X-linked nonsyndromic congenital hearing loss and the underlying genetic defect. Next-generation sequencing and subsequent segregation analysis detected a missense mutation (c.1771G>A, p.Gly591Ser) in the type IV collagen gene COL4A6 in all affected family members. Bioinformatic analysis and expression studies support this substitution as being causative. COL4A6 encodes the alpha-6 chain of type IV collagen of basal membranes, which forms a heterotrimer with two alpha-5 chains encoded by COL4A5. Whereas mutations in COL4A5 and contiguous X-chromosomal deletions involving COL4A5 and COL4A6 are associated with X-linked Alport syndrome, a nephropathy associated with deafness and cataract, mutations in COL4A6 alone have not been related to any hereditary disease so far. Moreover, our index patient and other affected family members show normal renal and ocular function, which is not consistent with Alport syndrome, but with a nonsyndromic type of hearing loss. In situ hybridization and immunostaining demonstrated expression of the COL4A6 homologs in the otic vesicle of the zebrafish and in the murine inner ear, supporting its role in normal ear development and function. In conclusion, our results suggest COL4A6 as being the fourth gene associated with X-linked nonsyndromic hearing loss.
|
837
|
Víquez-Zamora M, Vosman B, van de Geest H, Bovy A, Visser RGF, Finkers R, van Heusden AW. Tomato breeding in the genomics era: insights from a SNP array. BMC Genomics 2013; 14:354. [PMID: 23711327 PMCID: PMC3680325 DOI: 10.1186/1471-2164-14-354] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2013] [Accepted: 05/20/2013] [Indexed: 02/01/2023] Open
Abstract
BACKGROUND The major bottleneck in genetic and linkage studies in tomato has been the lack of a sufficient number of molecular markers. This has radically changed with the application of next-generation sequencing and high-throughput genotyping. A set of 6000 SNPs was identified and 5528 of them were used to evaluate tomato germplasm at the level of species, varieties and segregating populations. RESULTS From the 5528 SNPs, 1980 originated from 454-sequencing, 3495 from Illumina Solexa sequencing and 53 were additional known markers. Genotyping different tomato samples allowed the evaluation of the level of heterozygosity and introgressions among commercial varieties. Cherry tomatoes were especially different from round/beefs in chromosomes 4, 5 and 12. We were able to identify a set of 750 unique markers distinguishing S. lycopersicum 'Moneymaker' from all its distantly related wild relatives. Clustering and neighbour joining analysis among varieties and species showed expected grouping patterns, with S. pimpinellifolium as the most closely related to commercial tomatoes, consistent with earlier results. CONCLUSIONS Our results show that a SNP search in only a few breeding lines already provides generally applicable markers in tomato and its wild relatives. They also show that the data generated by the Illumina bead array are highly reproducible. Our SNPs can roughly be divided into two categories: SNPs of which both forms are present in the wild relatives and in domesticated tomatoes (originating from common ancestors) and SNPs unique to the domesticated tomato (originating after the domestication event). The SNPs can be used for genotyping, identification of varieties, comparison of genetic and physical linkage maps and to confirm (phylogenetic) relations. The SNPs used for the array hardly overlap with the SolCAP array, and it is strongly recommended to combine both SNP sets and to select a core collection of robust SNPs completely covering the entire tomato genome.
Affiliation(s)
- Marcela Víquez-Zamora
- Wageningen UR Plant Breeding, P.O. Box 16, AJ, Wageningen, 6700, The Netherlands
- Centre for Biosystems Genomics, P.O. Box 98, AB, Wageningen, 6700, The Netherlands
- Graduate School Experimental Plant Sciences, Wageningen Campus, PB Wageningen, 6807, The Netherlands
| | - Ben Vosman
- Wageningen UR Plant Breeding, P.O. Box 16, AJ, Wageningen, 6700, The Netherlands
- Centre for Biosystems Genomics, P.O. Box 98, AB, Wageningen, 6700, The Netherlands
| | - Henri van de Geest
- Centre for Biosystems Genomics, P.O. Box 98, AB, Wageningen, 6700, The Netherlands
- Bioscience, Plant Research International, P.O. Box 619, AP Wageningen, 6700, The Netherlands
| | - Arnaud Bovy
- Wageningen UR Plant Breeding, P.O. Box 16, AJ, Wageningen, 6700, The Netherlands
- Centre for Biosystems Genomics, P.O. Box 98, AB, Wageningen, 6700, The Netherlands
| | - Richard GF Visser
- Wageningen UR Plant Breeding, P.O. Box 16, AJ, Wageningen, 6700, The Netherlands
- Centre for Biosystems Genomics, P.O. Box 98, AB, Wageningen, 6700, The Netherlands
| | - Richard Finkers
- Wageningen UR Plant Breeding, P.O. Box 16, AJ, Wageningen, 6700, The Netherlands
- Centre for Biosystems Genomics, P.O. Box 98, AB, Wageningen, 6700, The Netherlands
| | - Adriaan W van Heusden
- Wageningen UR Plant Breeding, P.O. Box 16, AJ, Wageningen, 6700, The Netherlands
- Centre for Biosystems Genomics, P.O. Box 98, AB, Wageningen, 6700, The Netherlands
|
838
|
Warshauer DH, Lin D, Hari K, Jain R, Davis C, Larue B, King JL, Budowle B. STRait Razor: a length-based forensic STR allele-calling tool for use with second generation sequencing data. Forensic Sci Int Genet 2013; 7:409-17. [PMID: 23768312 DOI: 10.1016/j.fsigen.2013.04.005] [Citation(s) in RCA: 89] [Impact Index Per Article: 7.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2013] [Revised: 04/04/2013] [Accepted: 04/15/2013] [Indexed: 12/31/2022]
Abstract
Recent studies have demonstrated the capability of second generation sequencing (SGS) to provide coverage of short tandem repeats (STRs) found within the human genome. However, there are relatively few bioinformatic software packages capable of detecting these markers in the raw sequence data. The extant STR-calling tools are sophisticated, but are not always applicable to the analysis of the STR loci commonly used in forensic analyses. STRait Razor is a newly developed Perl-based software tool that runs on the Linux/Unix operating system and is designed to detect forensically-relevant STR alleles in FASTQ sequence data, based on allelic length. It is capable of analyzing STR loci with repeat motifs ranging from simple to complex without the need for extensive allelic sequence data. STRait Razor is designed to interpret both single-end and paired-end data and relies on intelligent parallel processing to reduce analysis time. Users are presented with a number of customization options, including variable mismatch detection parameters, as well as the ability to easily allow for the detection of alleles at new loci. In its current state, the software detects alleles for 44 autosomal and Y-chromosome STR loci. The study described herein demonstrates that STRait Razor is capable of detecting STR alleles in data generated by multiple library preparation methods and two Illumina® sequencing instruments, with 100% concordance. The data also reveal noteworthy concepts related to the effect of different preparation chemistries and sequencing parameters on the bioinformatic detection of STR alleles.
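The idea of calling an STR allele from its length can be illustrated with a minimal sketch. This is not STRait Razor's Perl implementation; it only assumes that a locus is described by two exact-match flanking sequences and a repeat-motif length. The function names, flank sequences, and reads below are all hypothetical, and mismatch tolerance, reverse-complement reads, and paired-end handling are left out.

```python
from collections import Counter


def call_allele_by_length(read, left_flank, right_flank, motif_len):
    """Return a length-based allele label for one read, or None if a flank is missing.

    The label follows the usual forensic convention: full repeat count,
    plus ".N" when N extra bases remain (e.g. "9.3").
    """
    start = read.find(left_flank)
    if start == -1:
        return None
    repeat_start = start + len(left_flank)
    end = read.find(right_flank, repeat_start)
    if end == -1:
        return None
    # Length of the sequence between the two flanks, in bases.
    repeat_len = end - repeat_start
    full, partial = divmod(repeat_len, motif_len)
    return str(full) if partial == 0 else f"{full}.{partial}"


def tally_alleles(reads, left_flank, right_flank, motif_len):
    """Count allele labels across reads, skipping reads that lack either flank."""
    calls = (call_allele_by_length(r, left_flank, right_flank, motif_len) for r in reads)
    return Counter(c for c in calls if c is not None)


# Hypothetical locus: flanks ACGTAC / TTGCA around a 4-base GATA repeat.
LEFT, RIGHT = "ACGTAC", "TTGCA"
reads = [
    LEFT + "GATA" * 5 + RIGHT,
    LEFT + "GATA" * 5 + RIGHT,
    LEFT + "GATA" * 5 + "GAT" + RIGHT,
    "NNNNNNNN",  # no flanks, ignored
]
counts = tally_alleles(reads, LEFT, RIGHT, motif_len=4)
```

Because only the distance between the flanks is measured, a sketch like this needs no catalogue of allele sequences, which is the property the abstract attributes to the length-based approach.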
Affiliation(s)
- David H Warshauer
- Institute of Applied Genetics, Department of Forensic and Investigative Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Boulevard, Fort Worth, TX 76107, USA
|
839
|
Simons C, Wolf N, McNeil N, Caldovic L, Devaney J, Takanohashi A, Crawford J, Ru K, Grimmond S, Miller D, Tonduti D, Schmidt J, Chudnow R, van Coster R, Lagae L, Kisler J, Sperner J, van der Knaap M, Schiffmann R, Taft R, Vanderver A. A de novo mutation in the β-tubulin gene TUBB4A results in the leukoencephalopathy hypomyelination with atrophy of the basal ganglia and cerebellum. Am J Hum Genet 2013; 92:767-73. [PMID: 23582646 DOI: 10.1016/j.ajhg.2013.03.018] [Citation(s) in RCA: 141] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2013] [Revised: 02/12/2013] [Accepted: 03/19/2013] [Indexed: 01/06/2023] Open
Abstract
Hypomyelination with atrophy of the basal ganglia and cerebellum (H-ABC) is a rare hereditary leukoencephalopathy that was originally identified by MRI pattern analysis, and it has thus far defied all attempts at identifying the causal mutation. Only 22 cases are published in the literature to date. We performed exome sequencing on five family trios, two family quartets, and three single probands, which revealed that all eleven H-ABC-diagnosed individuals carry the same de novo single-nucleotide TUBB4A mutation resulting in nonsynonymous change p.Asp249Asn. Detailed investigation of one of the family quartets with the singular finding of an H-ABC-affected sibling pair revealed maternal mosaicism for the mutation, suggesting that rare de novo mutations that are initially phenotypically neutral in a mosaic individual can be disease causing in the subsequent generation. Modeling of TUBB4A shows that the mutation creates a nonsynonymous change at a highly conserved aspartate that sits at the intradimer interface of α-tubulin and β-tubulin, and this change might affect tubulin dimerization, microtubule polymerization, or microtubule stability. Consistent with H-ABC's clinical presentation, TUBB4A is highly expressed in neurons, and a recent report has shown that an N-terminal alteration is associated with a heritable dystonia. Together, these data demonstrate that a single de novo mutation in TUBB4A results in H-ABC.
|
840
|
Solomon O, Oren S, Safran M, Deshet-Unger N, Akiva P, Jacob-Hirsch J, Cesarkas K, Kabesa R, Amariglio N, Unger R, Rechavi G, Eyal E. Global regulation of alternative splicing by adenosine deaminase acting on RNA (ADAR). RNA (NEW YORK, N.Y.) 2013; 19:591-604. [PMID: 23474544 PMCID: PMC3677275 DOI: 10.1261/rna.038042.112] [Citation(s) in RCA: 95] [Impact Index Per Article: 7.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/15/2023]
Abstract
Alternative mRNA splicing is a major mechanism for gene regulation and transcriptome diversity. Despite the extent of the phenomenon, the regulation and specificity of the splicing machinery are only partially understood. Adenosine-to-inosine (A-to-I) RNA editing of pre-mRNA by ADAR enzymes has been linked to splicing regulation in several cases. Here we used bioinformatics approaches, RNA-seq and exon-specific microarray of ADAR knockdown cells to globally examine how ADAR and its A-to-I RNA editing activity influence alternative mRNA splicing. Although A-to-I RNA editing only rarely targets canonical splicing acceptor, donor, and branch sites, it was found to affect splicing regulatory elements (SREs) within exons. Cassette exons were found to be significantly enriched with A-to-I RNA editing sites compared with constitutive exons. RNA-seq and exon-specific microarray revealed that ADAR knockdown in hepatocarcinoma and myelogenous leukemia cell lines leads to global changes in gene expression, with hundreds of genes changing their splicing patterns in both cell lines. This global change in splicing pattern cannot be explained by putative editing sites alone. Genes showing significant changes in their splicing pattern are frequently involved in RNA processing and splicing activity. Analysis of recently published RNA-seq data from glioblastoma cell lines showed similar results. Our global analysis reveals that ADAR plays a major role in splicing regulation. Although direct editing of the splicing motifs does occur, we suggest it is not likely to be the primary mechanism for ADAR-mediated regulation of alternative splicing. Rather, this regulation is achieved by modulating trans-acting factors involved in the splicing machinery.
Affiliation(s)
- Oz Solomon
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Ramat Gan, Israel
- The Everard & Mina Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900, Israel
| | - Shirley Oren
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Ramat Gan, Israel
- Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
| | - Michal Safran
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Ramat Gan, Israel
| | - Naamit Deshet-Unger
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Ramat Gan, Israel
| | - Pinchas Akiva
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Ramat Gan, Israel
- Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
| | - Jasmine Jacob-Hirsch
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Ramat Gan, Israel
| | - Karen Cesarkas
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Ramat Gan, Israel
| | - Reut Kabesa
- The Everard & Mina Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900, Israel
| | - Ninette Amariglio
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Ramat Gan, Israel
| | - Ron Unger
- The Everard & Mina Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan 52900, Israel
| | - Gideon Rechavi
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Ramat Gan, Israel
- Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
| | - Eran Eyal
- Cancer Research Center, Chaim Sheba Medical Center, Tel Hashomer 52621, Ramat Gan, Israel
- Corresponding author
|
841
|
Nijveen H, van Kaauwen M, Esselink DG, Hoegen B, Vosman B. QualitySNPng: a user-friendly SNP detection and visualization tool. Nucleic Acids Res 2013; 41:W587-90. [PMID: 23632165 PMCID: PMC3692117 DOI: 10.1093/nar/gkt333] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open
Abstract
QualitySNPng is a new software tool for the detection and interactive visualization of single-nucleotide polymorphisms (SNPs). It uses a haplotype-based strategy to identify reliable SNPs; it is optimized for the analysis of current RNA-seq data; but it can also be used on genomic DNA sequences derived from next-generation sequencing experiments. QualitySNPng does not require a sequenced reference genome and delivers reliable SNPs for di- as well as polyploid species. The tool features a user-friendly interface, multiple filtering options to handle typical sequencing errors, support for SAM and ACE files and interactive visualization. QualitySNPng produces high-quality SNP information that can be used directly in genotyping by sequencing approaches for application in QTL and genome-wide association mapping as well as to populate SNP arrays. The software can be used as a stand-alone application with a graphical user interface or as part of a pipeline system like Galaxy. Versions for Windows, Mac OS X and Linux, as well as the source code, are available from http://www.bioinformatics.nl/QualitySNPng.
Affiliation(s)
- Harm Nijveen
- Department of Plant Sciences, Laboratory of Bioinformatics, Wageningen University, PO Box 569, 6700AN Wageningen, The Netherlands.
|
842
|
Identification of direct targets and modified bases of RNA cytosine methyltransferases. Nat Biotechnol 2013; 31:458-64. [PMID: 23604283 PMCID: PMC3791587 DOI: 10.1038/nbt.2566] [Citation(s) in RCA: 359] [Impact Index Per Article: 29.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/01/2013] [Accepted: 04/02/2013] [Indexed: 12/24/2022]
Abstract
The extent and biological impact of RNA cytosine methylation are poorly understood, in part owing to limitations of current techniques for determining the targets of RNA methyltransferases. Here we describe 5-azacytidine-mediated RNA immunoprecipitation (Aza-IP), a mechanism-based technique that exploits the covalent bond formed between an RNA methyltransferase and the cytidine analog 5-azacytidine to recover RNA targets by immunoprecipitation. Targets are subsequently identified by high-throughput sequencing. When applied in a human cell line to the RNA methyltransferases DNMT2 and NSUN2, Aza-IP enabled >200-fold enrichment of tRNAs that are known targets of the enzymes. In addition, it revealed many tRNA and non-coding RNA targets not previously associated with NSUN2. Notably, we observed a high frequency of C>G transversions at the cytosine residues targeted by both enzymes, allowing identification of the specific methylated cytosine(s) in target RNAs. Given the mechanistic similarity of cytosine RNA methyltransferases, Aza-IP may be generally applicable for target identification.
|
843
|
Bareke E, Saillour V, Spinella JF, Vidal R, Healy J, Sinnett D, Csűrös M. Joint genotype inference with germline and somatic mutations. BMC Bioinformatics 2013; 14 Suppl 5:S3. [PMID: 23734724 PMCID: PMC3622648 DOI: 10.1186/1471-2105-14-s5-s3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open
Abstract
The joint sequencing of related genomes has become an important means to discover rare variants. Normal-tumor genome pairs are routinely sequenced together to find somatic mutations and their associations with different cancers. Parental and sibling genomes reveal de novo germline mutations and inheritance patterns related to Mendelian diseases. Acute lymphoblastic leukemia (ALL) is the most common paediatric cancer and the leading cause of cancer-related death among children. With the aim of uncovering the full spectrum of germline and somatic genetic alterations in childhood ALL genomes, we conducted whole-exome re-sequencing on a unique cohort of over 120 exomes of childhood ALL quartets, each comprising a patient's tumor and matched-normal material, and DNA from both parents. We developed a general probabilistic model for such quartet sequencing reads mapped to the reference human genome. The model is used to infer joint genotypes at homologous loci across a normal-tumor genome pair and two parental genomes. We describe the algorithms and data structures for genotype inference and model parameter training. We implemented the methods in an open-source software package (QUADGT) that uses the standard file formats of the 1000 Genomes Project. Our method's utility is illustrated on quartets from the ALL cohort.
Affiliation(s)
- Eric Bareke
- Department of Computer Science and Operations Research, University of Montréal, QC, Canada
|
844
|
Nota B, Struys E, Pop A, Jansen E, Fernandez Ojeda M, Kanhai W, Kranendijk M, van Dooren S, Bevova M, Sistermans E, Nieuwint A, Barth M, Ben-Omran T, Hoffmann G, de Lonlay P, McDonald M, Meberg A, Muntau A, Nuoffer JM, Parini R, Read MH, Renneberg A, Santer R, Strahleck T, van Schaftingen E, van der Knaap M, Jakobs C, Salomons G. Deficiency in SLC25A1, encoding the mitochondrial citrate carrier, causes combined D-2- and L-2-hydroxyglutaric aciduria. Am J Hum Genet 2013; 92:627-31. [PMID: 23561848 DOI: 10.1016/j.ajhg.2013.03.009] [Citation(s) in RCA: 112] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2012] [Revised: 01/02/2013] [Accepted: 03/13/2013] [Indexed: 12/17/2022] Open
Abstract
The Krebs cycle is of fundamental importance for the generation of the energetic and molecular needs of both prokaryotic and eukaryotic cells. Both enantiomers of metabolite 2-hydroxyglutarate are directly linked to this pivotal biochemical pathway and are found elevated not only in several cancers, but also in different variants of the neurometabolic disease 2-hydroxyglutaric aciduria. Recently we showed that cancer-associated IDH2 germline mutations cause one variant of 2-hydroxyglutaric aciduria. Complementary to these findings, we now report recessive mutations in SLC25A1, the mitochondrial citrate carrier, in 12 out of 12 individuals with combined D-2- and L-2-hydroxyglutaric aciduria. Impaired mitochondrial citrate efflux, demonstrated by stable isotope labeling experiments and the absence of SLC25A1 in fibroblasts harboring certain mutations, suggest that SLC25A1 deficiency is pathogenic. Our results identify defects in SLC25A1 as a cause of combined D-2- and L-2-hydroxyglutaric aciduria.
|
845
|
|
846
|
Bettencourt C, López-Sendón JL, García-Caldentey J, Rizzu P, Bakker IMC, Shomroni O, Quintáns B, Dávila JR, Bevova MR, Sobrido MJ, Heutink P, de Yébenes JG. Exome sequencing is a useful diagnostic tool for complicated forms of hereditary spastic paraplegia. Clin Genet 2013; 85:154-8. [PMID: 23438842 DOI: 10.1111/cge.12133] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2012] [Revised: 02/20/2013] [Accepted: 02/20/2013] [Indexed: 01/02/2023]
Abstract
Hereditary spastic paraplegias constitute a heterogeneous group of neurodegenerative diseases encompassing pure and complicated forms, for which at least 52 loci and 31 causative genes have been identified. Although mutations in the SPAST gene explain approximately 40% of the pure autosomal dominant forms, molecular diagnosis can be challenging for the sporadic and recessive forms, which are often complicated and clinically overlap with a broad number of movement disorders. The validity of exome sequencing as a routine diagnostic approach in the movement disorder clinic needs to be assessed. The main goal of this study was to explore the usefulness of an exome analysis for the diagnosis of a complicated form of spastic paraplegia. Whole-exome sequencing was performed in two Spanish siblings with a neurodegenerative syndrome including upper and lower motor neuron, ocular and cerebellar signs. Exome sequencing revealed that both patients carry a novel homozygous nonsense mutation in exon 15 of the SPG11 gene (c.2678G>A; p.W893X), which was not found in 584 Spanish control chromosomes. After many years of follow-up and multiple time-consuming genetic tests, we were able to diagnose these patients by making use of whole-exome sequencing, showing that this is a cost-efficient diagnostic tool for the movement disorder specialist.
Affiliation(s)
- C Bettencourt
- Institute for Molecular and Cell Biology (IBMC), University of Porto, Porto, Portugal; Center of Research in Natural Resources (CIRN) and Department of Biology, University of the Azores, Ponta Delgada, Portugal
|
847
|
Kevelam SH, Bugiani M, Salomons GS, Feigenbaum A, Blaser S, Prasad C, Häberle J, Barić I, Bakker IMC, Postma NL, Kanhai WA, Wolf NI, Abbink TEM, Waisfisz Q, Heutink P, van der Knaap MS. Exome sequencing reveals mutated SLC19A3 in patients with an early-infantile, lethal encephalopathy. Brain 2013; 136:1534-43. [DOI: 10.1093/brain/awt054] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
|
848
|
Zhu S, Dai YM, Zhang XY, Ye JR, Wang MX, Huang MR. Untangling the transcriptome from fungus-infected plant tissues. Gene 2013; 519:238-44. [PMID: 23466979 DOI: 10.1016/j.gene.2013.02.023] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2012] [Revised: 01/29/2013] [Accepted: 02/13/2013] [Indexed: 12/31/2022]
Abstract
The development of sequencing technology allows low-cost generation of sequence data. The huge amount of raw sequence data now available has introduced many challenges associated with analysis of these large-scale data banks. For example, it is very important to distinguish materials of plant and fungal origin in fungus-infected plant tissue. The origin of transcripts that were sequenced from Library 895-M6 (poplar tissue infected by Marssonina brunnea) on Illumina/Solexa GA IIx was determined by combining three methods: (1) based on the taxonomic information of homologous sequences; (2) based on the reference genome sequence; (3) based on the transcriptome sequence of the host and its pathogen obtained from Library 895 (poplar) and Library M6 (M. brunnea) as well as Library 895-M6 (mixture of poplar and M. brunnea). We accurately identified the origin of 80,978 (99.5%) contigs in the mixed poplar and M. brunnea sample (Library 895-M6) by integrating the results from the three methods. The results of this study demonstrate that the combination of the three approaches described here is an effective strategy for determining the origin of sequences in a mixed pool, and provides a basis for further transcriptome analysis of the mixed sample.
Affiliation(s)
- Sheng Zhu
- Jiangsu Key Laboratory for Poplar Germplasm Enhancement and Variety Improvement, Nanjing Forestry University, Nanjing 210037, China
|
849
|
Singhal S. De novo transcriptomic analyses for non‐model organisms: an evaluation of methods across a multi‐species data set. Mol Ecol Resour 2013; 13:403-16. [DOI: 10.1111/1755-0998.12077] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2012] [Revised: 12/13/2012] [Accepted: 12/22/2012] [Indexed: 01/09/2023]
Affiliation(s)
- Sonal Singhal
- Museum of Vertebrate Zoology University of California, Berkeley 3101 Valley Life Sciences Building Berkeley CA 94720‐3160 USA
- Department of Integrative Biology University of California, Berkeley 1005 Valley Life Sciences Building Berkeley CA 94720‐3140 USA
|
850
|
Yost SE, Pastorino S, Rozenzhak S, Smith EN, Chao YS, Jiang P, Kesari S, Frazer KA, Harismendy O. High-resolution mutational profiling suggests the genetic validity of glioblastoma patient-derived pre-clinical models. PLoS One 2013; 8:e56185. [PMID: 23441165 PMCID: PMC3575368 DOI: 10.1371/journal.pone.0056185] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2012] [Accepted: 01/07/2013] [Indexed: 11/19/2022] Open
Abstract
Recent advances in the ability to efficiently characterize tumor genomes are enabling targeted drug development, which requires rigorous biomarker-based patient selection to increase effectiveness. Consequently, representative DNA biomarkers become equally important in pre-clinical studies. However, it is still unclear how well these markers are maintained between the primary tumor and the patient-derived tumor models. Here, we report the comprehensive identification of somatic coding mutations and copy number aberrations in four glioblastoma (GBM) primary tumors and their matched pre-clinical models: serum-free neurospheres, adherent cell cultures, and mouse xenografts. We developed innovative methods to improve the data quality and allow a strict comparison of matched tumor samples. Our analysis identifies known GBM mutations altering PTEN and TP53 genes, and new actionable mutations such as the loss of PIK3R1, and reveals clear patient-to-patient differences. In contrast, for each patient, we do not observe any significant remodeling of the mutational profile between primary and model tumors, and the few discrepancies can be attributed to stochastic errors or differences in sample purity. Similarly, we observe ∼96% primary-to-model concordance in copy number calls in the high-cellularity samples. In contrast to previous reports based on gene expression profiles, we do not observe significant differences at the DNA level between in vitro and in vivo models. This study suggests, at a remarkable resolution, the genome-wide conservation of a patient’s tumor genetics in various pre-clinical models, and therefore supports their use for the development and testing of personalized targeted therapies.
Affiliation(s)
- Shawn E. Yost
- Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, California, United States of America
- Division of Genome Information Sciences, Department of Pediatrics and Rady Children’s Hospital, University of California San Diego, La Jolla, California, United States of America
| | - Sandra Pastorino
- Department of Neurosciences, University of California San Diego, La Jolla, California, United States of America
- Translational Neuro-oncology Laboratories, University of California San Diego, La Jolla, California, United States of America
| | - Sophie Rozenzhak
- Division of Genome Information Sciences, Department of Pediatrics and Rady Children’s Hospital, University of California San Diego, La Jolla, California, United States of America
| | - Erin N. Smith
- Division of Genome Information Sciences, Department of Pediatrics and Rady Children’s Hospital, University of California San Diego, La Jolla, California, United States of America
| | - Ying S. Chao
- Department of Neurosciences, University of California San Diego, La Jolla, California, United States of America
- Translational Neuro-oncology Laboratories, University of California San Diego, La Jolla, California, United States of America
| | - Pengfei Jiang
- Department of Neurosciences, University of California San Diego, La Jolla, California, United States of America
- Translational Neuro-oncology Laboratories, University of California San Diego, La Jolla, California, United States of America
| | - Santosh Kesari
- Department of Neurosciences, University of California San Diego, La Jolla, California, United States of America
- Translational Neuro-oncology Laboratories, University of California San Diego, La Jolla, California, United States of America
- Moores UCSD Cancer Center, University of California San Diego, La Jolla, California, United States of America
- * E-mail: (OH); (SK)
| | - Kelly A. Frazer
- Division of Genome Information Sciences, Department of Pediatrics and Rady Children’s Hospital, University of California San Diego, La Jolla, California, United States of America
- Moores UCSD Cancer Center, University of California San Diego, La Jolla, California, United States of America
- Clinical and Translational Research Institute, University of California San Diego, La Jolla, California, United States of America
- Institute for Genomic Medicine, University of California San Diego, La Jolla, California, United States of America
| | - Olivier Harismendy
- Division of Genome Information Sciences, Department of Pediatrics and Rady Children’s Hospital, University of California San Diego, La Jolla, California, United States of America
- Moores UCSD Cancer Center, University of California San Diego, La Jolla, California, United States of America
- Clinical and Translational Research Institute, University of California San Diego, La Jolla, California, United States of America
- * E-mail: (OH); (SK)
|