51
|
Ma X, Zhang B, Ma C, Ma Z. Co-regularized nonnegative matrix factorization for evolving community detection in dynamic networks. Inf Sci (N Y) 2020; 528:265-279. [DOI: 10.1016/j.ins.2020.04.031] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 02/02/2023]
|
52
|
Zhang J, Liu J, Lee D, Feng JJ, Lochovsky L, Lou S, Rutenberg-Schoenberg M, Gerstein M. RADAR: annotation and prioritization of variants in the post-transcriptional regulome of RNA-binding proteins. Genome Biol 2020; 21:151. [PMID: 32727537 PMCID: PMC7391703 DOI: 10.1186/s13059-020-01979-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 01/03/2018] [Accepted: 02/28/2020] [Indexed: 12/20/2022] Open
Abstract
RNA-binding proteins (RBPs) play key roles in post-transcriptional regulation and disease. Their binding sites cover more of the genome than coding exons; nevertheless, most noncoding variant prioritization methods only focus on transcriptional regulation. Here, we integrate the portfolio of ENCODE-RBP experiments to develop RADAR, a variant-scoring framework. RADAR uses conservation, RNA structure, network centrality, and motifs to provide an overall impact score. Then, it further incorporates tissue-specific inputs to highlight disease-specific variants. Our results demonstrate RADAR can successfully pinpoint variants, both somatic and germline, associated with RBP-function dysregulation, which cannot be found by most current prioritization methods, for example, variants affecting splicing.
Affiliation(s)
- Jing Zhang
- Department of Computer Science, University of California, Irvine, CA, 92697, USA
| | - Jason Liu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06520, USA.,Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, 06520, USA
| | - Donghoon Lee
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06520, USA
| | - Jo-Jo Feng
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, 06520, USA
| | - Lucas Lochovsky
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, 06520, USA
| | - Shaoke Lou
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06520, USA.,Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, 06520, USA
| | - Michael Rutenberg-Schoenberg
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06520, USA.,Chemical Biology Institute, Yale University, West Haven, CT, 06516, USA
| | - Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06520, USA.,Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, 06520, USA.,Department of Computer Science, Yale University, New Haven, CT, 06520, USA.
| |
|
53
|
Huang YF. Unified inference of missense variant effects and gene constraints in the human genome. PLoS Genet 2020; 16:e1008922. [PMID: 32667917 PMCID: PMC7384676 DOI: 10.1371/journal.pgen.1008922] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 10/03/2019] [Revised: 07/27/2020] [Accepted: 06/09/2020] [Indexed: 01/25/2023] Open
Abstract
A challenge in medical genomics is to identify variants and genes associated with severe genetic disorders. Based on the premise that severe, early-onset disorders often result in a reduction of evolutionary fitness, several statistical methods have been developed to predict pathogenic variants or constrained genes based on the signatures of negative selection in human populations. However, we currently lack a statistical framework to jointly predict deleterious variants and constrained genes from both variant-level features and gene-level selective constraints. Here we present such a unified approach, UNEECON, based on deep learning and population genetics. UNEECON treats the contributions of variant-level features and gene-level constraints as a variant-level fixed effect and a gene-level random effect, respectively. The sum of the fixed and random effects is then combined with an evolutionary model to infer the strength of negative selection at both variant and gene levels. Compared with previously published methods, UNEECON shows improved performance in predicting missense variants and protein-coding genes associated with autosomal dominant disorders, and feature importance analysis suggests that both gene-level selective constraints and variant-level predictors are important for accurate variant prioritization. Furthermore, based on UNEECON, we observe a low correlation between gene-level intolerance to missense mutations and that to loss-of-function mutations, which can be partially explained by the prevalence of disordered protein regions that are highly tolerant to missense mutations. Finally, we show that genes intolerant to both missense and loss-of-function mutations play key roles in the central nervous system and the autism spectrum disorders. Overall, UNEECON is a promising framework for both variant and gene prioritization. 
Numerous statistical methods have been developed to predict deleterious missense variants or constrained genes in the human genome, but unified prioritization methods that utilize both variant- and gene-level information are underdeveloped. Here we present UNEECON, an evolution-based deep learning framework for unified variant and gene prioritization. By integrating variant-level predictors and gene-level selective constraints, UNEECON outperforms existing methods in predicting missense variants and protein-coding genes associated with dominant disorders. Based on UNEECON, we show that disordered proteins are tolerant to missense mutations but not to loss-of-function mutations. In addition, we find that genes under strong selective constraints at both missense and loss-of-function levels are strongly associated with the central nervous system and the autism spectrum disorders, highlighting the need to investigate the function of these highly constrained genes in future studies.
Affiliation(s)
- Yi-Fei Huang
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania, United States of America
| |
|
54
|
Michelson DJ, Clark RD. Optimizing Genetic Diagnosis of Neurodevelopmental Disorders in the Clinical Setting. Clin Lab Med 2020; 40:231-256. [PMID: 32718497 DOI: 10.1016/j.cll.2020.05.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 01/15/2023]
Abstract
Progress in medical genetics has changed the practice of medicine in general and child neurology in particular. A genetic diagnosis has become critically important in determining optimal management of many neurodevelopmental disorders, making genetic testing a routine consideration of patient care in outpatient and inpatient settings. Today's child neurologists should be familiar with various genetic testing modalities and their appropriate use. Molecular genetic testing of children with unexplained developmental delays and/or congenital anomalies has a 20% to 30% chance of identifying a causative etiology. Newer methods have made genetic testing more widely available and sensitive but also more likely to produce ambiguous results.
Affiliation(s)
- David Joshua Michelson
- Division of Child Neurology, Department of Pediatrics, Loma Linda University School of Medicine, Coleman Pavilion Room A, 1175 Campus Street, Loma Linda, CA 92354, USA.
| | - Robin Dawn Clark
- Division of Medical Genetics, Department of Pediatrics, Loma Linda University School of Medicine, Coleman Pavilion Room A, 1175 Campus Street, Loma Linda, CA 92354, USA
| |
|
55
|
Abel HJ, Larson DE, Regier AA, Chiang C, Das I, Kanchi KL, Layer RM, Neale BM, Salerno WJ, Reeves C, Buyske S, Matise TC, Muzny DM, Zody MC, Lander ES, Dutcher SK, Stitziel NO, Hall IM. Mapping and characterization of structural variation in 17,795 human genomes. Nature 2020; 583:83-89. [PMID: 32460305 PMCID: PMC7547914 DOI: 10.1038/s41586-020-2371-0] [Citation(s) in RCA: 181] [Impact Index Per Article: 36.2] [Received: 12/29/2018] [Accepted: 05/18/2020] [Indexed: 12/18/2022]
Abstract
A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0-11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing.
Affiliation(s)
- Haley J Abel
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
| | - David E Larson
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
| | - Allison A Regier
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Department of Medicine, Washington University School of Medicine, St Louis, MO, USA
| | - Colby Chiang
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
| | - Indraniel Das
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
| | - Krishna L Kanchi
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
| | - Ryan M Layer
- BioFrontiers Institute, University of Colorado, Boulder, CO, USA
- Department of Computer Science, University of Colorado, Boulder, CO, USA
| | - Benjamin M Neale
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
| | - William J Salerno
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | | | - Steven Buyske
- Department of Statistics, Rutgers University, Piscataway, NJ, USA
| | - Tara C Matise
- Department of Genetics, Rutgers University, Piscataway, NJ, USA
| | - Donna M Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | | | - Eric S Lander
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Susan K Dutcher
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
| | - Nathan O Stitziel
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
- Department of Medicine, Washington University School of Medicine, St Louis, MO, USA
| | - Ira M Hall
- McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA.
- Department of Genetics, Washington University School of Medicine, St Louis, MO, USA.
- Department of Medicine, Washington University School of Medicine, St Louis, MO, USA.
| |
|
56
|
|
57
|
Schwarz JM, Hombach D, Köhler S, Cooper DN, Schuelke M, Seelow D. RegulationSpotter: annotation and interpretation of extratranscriptic DNA variants. Nucleic Acids Res 2020; 47:W106-W113. [PMID: 31106382 PMCID: PMC6602480 DOI: 10.1093/nar/gkz327] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/11/2019] [Revised: 04/17/2019] [Accepted: 05/09/2019] [Indexed: 02/07/2023] Open
Abstract
RegulationSpotter is a web-based tool for the user-friendly annotation and interpretation of DNA variants located outside of protein-coding transcripts (extratranscriptic variants). It is designed for clinicians and researchers who wish to assess the potential impact of the considerable number of non-coding variants found in Whole Genome Sequencing runs. It annotates individual variants with underlying regulatory features in an intuitive way by assessing over 100 genome-wide annotations. Additionally, it calculates a score, which reflects the regulatory potential of the variant region. Its dichotomous classifications, ‘functional’ or ‘non-functional’, and a human-readable presentation of the underlying evidence allow a biologically meaningful interpretation of the score. The output shows key aspects of every variant and allows rapid access to more detailed information about its possible role in gene regulation. RegulationSpotter can either analyse single variants or complete VCF files. Variants located within protein-coding transcripts are automatically assessed by MutationTaster as well as by RegulationSpotter to account for possible intragenic regulatory effects. RegulationSpotter offers the possibility of using phenotypic data to focus on known disease genes or genomic elements interacting with them. RegulationSpotter is freely available at https://www.regulationspotter.org.
Affiliation(s)
- Jana Marie Schwarz
- Department of Neuropediatrics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Berlin, Germany.,Centrum für Therapieforschung, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Berlin, Germany.,NeuroCure Cluster of Excellence and NeuroCure Clinical Research Center, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Berlin, Germany
| | - Daniela Hombach
- Centrum für Therapieforschung, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Berlin, Germany.,NeuroCure Cluster of Excellence and NeuroCure Clinical Research Center, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Berlin, Germany
| | - Sebastian Köhler
- Centrum für Therapieforschung, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Berlin, Germany.,Berlin Institute of Health (BIH), Berlin, Germany.,Einstein Center for Digital Future, Berlin, Germany
| | - David N Cooper
- Institute of Medical Genetics, Cardiff University, Cardiff, UK
| | - Markus Schuelke
- Department of Neuropediatrics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Berlin, Germany.,NeuroCure Cluster of Excellence and NeuroCure Clinical Research Center, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Berlin, Germany
| | - Dominik Seelow
- Centrum für Therapieforschung, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health (BIH), Berlin, Germany.,Berlin Institute of Health (BIH), Berlin, Germany
| |
|
58
|
Zhang S, He Y, Liu H, Zhai H, Huang D, Yi X, Dong X, Wang Z, Zhao K, Zhou Y, Wang J, Yao H, Xu H, Yang Z, Sham PC, Chen K, Li MJ. regBase: whole genome base-wise aggregation and functional prediction for human non-coding regulatory variants. Nucleic Acids Res 2020; 47:e134. [PMID: 31511901 PMCID: PMC6868349 DOI: 10.1093/nar/gkz774] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 08/01/2019] [Accepted: 08/29/2019] [Indexed: 12/19/2022] Open
Abstract
Predicting the functional or pathogenic regulatory variants in the human non-coding genome facilitates the interpretation of disease causation. While numerous prediction methods are available, their performance is inconsistent or restricted to specific tasks, which raises the demand of developing comprehensive integration for those methods. Here, we compile whole genome base-wise aggregations, regBase, that incorporate largest prediction scores. Building on different assumptions of causality, we train three composite models to score functional, pathogenic and cancer driver non-coding regulatory variants respectively. We demonstrate the superior and stable performance of our models using independent benchmarks and show great success to fine-map causal regulatory variants on specific locus or at base-wise resolution. We believe that regBase database together with three composite models will be useful in different areas of human genetic studies, such as annotation-based causal variant fine-mapping, pathogenic variant discovery as well as cancer driver mutation identification. regBase is freely available at https://github.com/mulinlab/regBase.
Affiliation(s)
- Shijie Zhang
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Yukun He
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Huanhuan Liu
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Haoyu Zhai
- Department of Computer Science, University of Illinois Urbana-Champaign, IL, USA
| | - Dandan Huang
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China.,Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Xianfu Yi
- School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
| | - Xiaobao Dong
- Department of Genetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Zhao Wang
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Ke Zhao
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Yao Zhou
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Jianhua Wang
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Hongcheng Yao
- School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Hang Xu
- School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Zhenglu Yang
- College of Computer Science, Nankai University, Tianjin, China
| | - Pak Chung Sham
- Centre of Genomics Sciences, State Key Laboratory of Brain and Cognitive Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Kexin Chen
- Department of Epidemiology and Biostatistics, Tianjin Key Laboratory of Molecular Cancer Epidemiology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Mulin Jun Li
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China.,Department of Epidemiology and Biostatistics, Tianjin Key Laboratory of Molecular Cancer Epidemiology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| |
|
59
|
Siskova A, Cervena K, Kral J, Hucl T, Vodicka P, Vymetalkova V. Colorectal Adenomas-Genetics and Searching for New Molecular Screening Biomarkers. Int J Mol Sci 2020; 21:ijms21093260. [PMID: 32380676 PMCID: PMC7247353 DOI: 10.3390/ijms21093260] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 04/25/2020] [Revised: 05/01/2020] [Accepted: 05/02/2020] [Indexed: 02/06/2023] Open
Abstract
Colorectal cancer (CRC) is a malignant disease with an incidence of over 1.8 million new cases per year worldwide. CRC outcome is closely related to the respective stage of CRC and is more favorable at less advanced stages. Detection of early colorectal adenomas is the key to survival. In spite of implemented screening programs showing efficiency in the detection of early precancerous lesions and CRC in asymptomatic patients, a significant number of patients are still diagnosed in advanced stages. Research on CRC accomplished during the last decade has improved our understanding of the etiology and development of colorectal adenomas and revealed weaknesses in the general approach to their detection and elimination. Recent studies seek to find a reliable non-invasive biomarker detectable even in the blood. New candidate biomarkers could be selected on the basis of so-called liquid biopsy, such as long non-coding RNA, microRNA, circulating cell-free DNA, circulating tumor cells, and inflammatory factors released from the adenoma into circulation. In this work, we focused on both genetic and epigenetic changes associated with the development of colorectal adenomas into colorectal carcinoma and we also discuss new possible biomarkers that are detectable even in adenomas prior to cancer development.
Affiliation(s)
- Anna Siskova
- Department of Molecular Biology of Cancer, Institute of Experimental Medicine, Videnska 1083, 14200 Prague, Czech Republic
- Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University, Albertov 4, 12800 Prague, Czech Republic
- Correspondence: Tel.: +420-241062251 (A.S.); +420-241062694 (P.V.)
| | - Klara Cervena
- Department of Molecular Biology of Cancer, Institute of Experimental Medicine, Videnska 1083, 14200 Prague, Czech Republic
- Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University, Albertov 4, 12800 Prague, Czech Republic
| | - Jan Kral
- Department of Molecular Biology of Cancer, Institute of Experimental Medicine, Videnska 1083, 14200 Prague, Czech Republic
- Institute for Clinical and Experimental Medicine, Videnska 1958/9, 14021 Prague, Czech Republic
| | - Tomas Hucl
- Institute for Clinical and Experimental Medicine, Videnska 1958/9, 14021 Prague, Czech Republic
| | - Pavel Vodicka
- Department of Molecular Biology of Cancer, Institute of Experimental Medicine, Videnska 1083, 14200 Prague, Czech Republic
- Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University, Albertov 4, 12800 Prague, Czech Republic
- Biomedical Centre, Faculty of Medicine in Pilsen, Charles University, Alej Svobody 76, 32300 Pilsen, Czech Republic
- Correspondence: Tel.: +420-241062251 (A.S.); +420-241062694 (P.V.)
| | - Veronika Vymetalkova
- Department of Molecular Biology of Cancer, Institute of Experimental Medicine, Videnska 1083, 14200 Prague, Czech Republic
- Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University, Albertov 4, 12800 Prague, Czech Republic
- Biomedical Centre, Faculty of Medicine in Pilsen, Charles University, Alej Svobody 76, 32300 Pilsen, Czech Republic
| |
|
60
|
Xu D, Gokcumen O, Khurana E. Loss-of-function tolerance of enhancers in the human genome. PLoS Genet 2020; 16:e1008663. [PMID: 32243438 PMCID: PMC7159235 DOI: 10.1371/journal.pgen.1008663] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/29/2019] [Revised: 04/15/2020] [Accepted: 02/12/2020] [Indexed: 12/21/2022] Open
Abstract
Previous studies have surveyed the potential impact of loss-of-function (LoF) variants and identified LoF-tolerant protein-coding genes. However, the tolerance of human genomes to losing enhancers has not yet been evaluated. Here we present the catalog of LoF-tolerant enhancers using structural variants from whole-genome sequences. Using a conservative approach, we estimate that individual human genomes possess at least 28 LoF-tolerant enhancers on average. We assessed the properties of LoF-tolerant enhancers in a unified regulatory network constructed by integrating tissue-specific enhancers and gene-gene interactions. We find that LoF-tolerant enhancers tend to be more tissue-specific and regulate fewer and more dispensable genes relative to other enhancers. They are enriched in immune-related cells while enhancers with low LoF-tolerance are enriched in kidney and brain/neuronal stem cells. We developed a supervised learning approach to predict the LoF-tolerance of all enhancers, which achieved an area under the receiver operating characteristics curve (AUROC) of 98%. We predict 3,519 more enhancers would be likely tolerant to LoF and 129 enhancers that would have low LoF-tolerance. Our predictions are supported by a known set of disease enhancers and novel deletions from PacBio sequencing. The LoF-tolerance scores provided here will serve as an important reference for disease studies. Enhancers are elements where transcription factors bind and regulate the expression of protein-coding genes. Although multiple previous studies have focused on which genes can tolerate loss-of-function (LoF), none has systematically evaluated the tolerance of all enhancers in the human genome to LoF. Individual studies have shown a broad range of phenotypic effects of enhancer LoF. 
The phenotypic effects of enhancer LoF likely fall into a spectrum where deletion of LoF-tolerant enhancers would not elicit substantial phenotypic impact, while some enhancers are likely to cause fitness defects when deleted. Here we report a systematic computational approach that uses machine learning and properties of enhancers in a unified human regulatory network with tissue-specific annotations to predict the LoF-tolerance of all enhancers identified in the human genome. The LoF-tolerance scores of enhancers provided in this study can significantly facilitate the interpretation and prioritization of non-coding sequence variants for disease and functional studies.
Affiliation(s)
- Duo Xu
- Institute for Computational Biomedicine, Weill Cornell Medicine, New York, New York, United States of America
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, New York, United States of America
- Englander Institute for Precision Medicine, New York Presbyterian Hospital-Weill Cornell Medicine, New York, New York, United States of America
- Meyer Cancer Center, Weill Cornell Medicine, New York, New York, United States of America
| | - Omer Gokcumen
- Department of Biological Sciences, University at Buffalo, The State University of New York, Buffalo, New York, United States of America
| | - Ekta Khurana
- Institute for Computational Biomedicine, Weill Cornell Medicine, New York, New York, United States of America
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, New York, United States of America
- Englander Institute for Precision Medicine, New York Presbyterian Hospital-Weill Cornell Medicine, New York, New York, United States of America
- Meyer Cancer Center, Weill Cornell Medicine, New York, New York, United States of America
| |
|
61
|
Kumar S, Warrell J, Li S, McGillivray PD, Meyerson W, Salichos L, Harmanci A, Martinez-Fundichely A, Chan CWY, Nielsen MM, Lochovsky L, Zhang Y, Li X, Lou S, Pedersen JS, Herrmann C, Getz G, Khurana E, Gerstein MB. Passenger Mutations in More Than 2,500 Cancer Genomes: Overall Molecular Functional Impact and Consequences. Cell 2020; 180:915-927.e16. [PMID: 32084333 PMCID: PMC7210002 DOI: 10.1016/j.cell.2020.01.032] [Citation(s) in RCA: 89] [Impact Index Per Article: 17.8] [Received: 10/01/2018] [Revised: 08/23/2019] [Accepted: 01/29/2020] [Indexed: 01/23/2023]
Abstract
The dichotomous model of "drivers" and "passengers" in cancer posits that only a few mutations in a tumor strongly affect its progression, with the remaining ones being inconsequential. Here, we leveraged the comprehensive variant dataset from the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) project to demonstrate that, in addition to the dichotomy of high- and low-impact variants, there is a third group of medium-impact putative passengers. Moreover, we also found that molecular impact correlates with subclonal architecture (i.e., early versus late mutations), and different signatures encode for mutations with divergent impact. Furthermore, we adapted an additive-effects model from complex-trait studies to show that the aggregated effect of putative passengers, including undetected weak drivers, provides significant additional power (∼12% additive variance) for predicting cancerous phenotypes, beyond PCAWG-identified driver mutations. Finally, this framework allowed us to estimate the frequency of potential weak-driver mutations in PCAWG samples lacking any well-characterized driver alterations.
Affiliation(s)
- Sushant Kumar
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Jonathan Warrell
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Shantao Li
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Patrick D McGillivray
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA; Yale School of Medicine, Yale University, New Haven, CT 06510, USA
| | - William Meyerson
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Yale School of Medicine, Yale University, New Haven, CT 06510, USA
| | - Leonidas Salichos
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Arif Harmanci
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Center for Precision Health, School of Biomedical Informatics, University of Texas Health Sciences Center, Houston, TX 77030, USA
| | - Alexander Martinez-Fundichely
- Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY 10021, USA; Department of Physiology and Biophysics, Weill Cornell Medicine, 1300 York Avenue, New York, NY 10065, USA; Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Calvin W Y Chan
- Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg 69120, Germany; Faculty of Biosciences, Heidelberg University, Heidelberg 69120, Germany
| | - Morten Muhlig Nielsen
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
| | - Lucas Lochovsky
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Yan Zhang
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Biomedical Informatics, College of Medicine, Ohio State University, Columbus, OH 43210, USA; The Ohio State University Comprehensive Cancer Center (OSUCCC-James), Columbus, OH 43210, USA
| | - Xiaotong Li
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
| | - Shaoke Lou
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Jakob Skou Pedersen
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark; Bioinformatics Research Centre (BiRC), Aarhus University, Aarhus, Denmark
| | - Carl Herrmann
- Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg 69120, Germany; Health Data Science Unit, Medical Faculty Heidelberg and BioQuant, Heidelberg 69120, Germany
| | - Gad Getz
- The Broad Institute of MIT and Harvard, Cambridge, MA 02124, USA; Massachusetts General Hospital Center for Cancer Research, Charlestown, MA 02129, USA; Harvard Medical School, 250 Longwood Avenue, Boston, MA 02115, USA
| | - Ekta Khurana
- Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY 10021, USA; Department of Physiology and Biophysics, Weill Cornell Medicine, 1300 York Avenue, New York, NY 10065, USA; Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA; Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
| | - Mark B Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA; Department of Computer Science, Yale University, New Haven, CT 06511, USA
| |
|
62
|
Venkat S, Tisdale AA, Schwarz JR, Alahmari AA, Maurer HC, Olive KP, Eng KH, Feigin ME. Alternative polyadenylation drives oncogenic gene expression in pancreatic ductal adenocarcinoma. Genome Res 2020; 30:347-360. [PMID: 32029502 PMCID: PMC7111527 DOI: 10.1101/gr.257550.119] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 09/25/2019] [Accepted: 02/04/2020] [Indexed: 01/08/2023]
Abstract
Alternative polyadenylation (APA) is a gene regulatory process that dictates mRNA 3'-UTR length, resulting in changes in mRNA stability and localization. APA is frequently disrupted in cancer and promotes tumorigenesis through altered expression of oncogenes and tumor suppressors. Pan-cancer analyses have revealed common APA events across the tumor landscape; however, little is known about tumor type-specific alterations that may uncover novel events and vulnerabilities. Here, we integrate RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project and The Cancer Genome Atlas (TCGA) to comprehensively analyze APA events in 148 pancreatic ductal adenocarcinomas (PDACs). We report widespread, recurrent, and functionally relevant 3'-UTR alterations associated with gene expression changes of known and newly identified PDAC growth-promoting genes and experimentally validate the effects of these APA events on protein expression. We find enrichment for APA events in genes associated with known PDAC pathways, loss of tumor-suppressive miRNA binding sites, and increased heterogeneity in 3'-UTR forms of metabolic genes. Survival analyses reveal a subset of 3'-UTR alterations that independently characterize a poor prognostic cohort among PDAC patients. Finally, we identify and validate the casein kinase CSNK1A1 (also known as CK1alpha or CK1a) as an APA-regulated therapeutic target in PDAC. Knockdown or pharmacological inhibition of CSNK1A1 attenuates PDAC cell proliferation and clonogenic growth. Our single-cancer analysis reveals APA as an underappreciated driver of protumorigenic gene expression in PDAC via the loss of miRNA regulation.
Affiliation(s)
- Swati Venkat
- Department of Pharmacology and Therapeutics, Roswell Park Comprehensive Cancer Center, Buffalo, New York 14263, USA
| | - Arwen A Tisdale
- Department of Pharmacology and Therapeutics, Roswell Park Comprehensive Cancer Center, Buffalo, New York 14263, USA
| | - Johann R Schwarz
- Department of Pharmacology and Therapeutics, Roswell Park Comprehensive Cancer Center, Buffalo, New York 14263, USA
| | - Abdulrahman A Alahmari
- Department of Pharmacology and Therapeutics, Roswell Park Comprehensive Cancer Center, Buffalo, New York 14263, USA
| | - H Carlo Maurer
- Klinikum rechts der Isar, II. Medizinische Klinik, Technische Universität München, 81675 Munich, Germany
| | - Kenneth P Olive
- Herbert Irving Comprehensive Cancer Center, Department of Medicine, Division of Digestive and Liver Diseases, Department of Pathology and Cell Biology, Columbia University Medical Center, New York, New York 10032, USA
| | - Kevin H Eng
- Department of Cancer Genetics and Genomics, Roswell Park Comprehensive Cancer Center, Buffalo, New York 14263, USA
- Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, New York 14263, USA
| | - Michael E Feigin
- Department of Pharmacology and Therapeutics, Roswell Park Comprehensive Cancer Center, Buffalo, New York 14263, USA
| |
|
63
|
Rheinbay E, Nielsen MM, Abascal F, Wala JA, Shapira O, Tiao G, Hornshøj H, Hess JM, Juul RI, Lin Z, Feuerbach L, Sabarinathan R, Madsen T, Kim J, Mularoni L, Shuai S, Lanzós A, Herrmann C, Maruvka YE, Shen C, Amin SB, Bandopadhayay P, Bertl J, Boroevich KA, Busanovich J, Carlevaro-Fita J, Chakravarty D, Chan CWY, Craft D, Dhingra P, Diamanti K, Fonseca NA, Gonzalez-Perez A, Guo Q, Hamilton MP, Haradhvala NJ, Hong C, Isaev K, Johnson TA, Juul M, Kahles A, Kahraman A, Kim Y, Komorowski J, Kumar K, Kumar S, Lee D, Lehmann KV, Li Y, Liu EM, Lochovsky L, Park K, Pich O, Roberts ND, Saksena G, Schumacher SE, Sidiropoulos N, Sieverling L, Sinnott-Armstrong N, Stewart C, Tamborero D, Tubio JMC, Umer HM, Uusküla-Reimand L, Wadelius C, Wadi L, Yao X, Zhang CZ, Zhang J, Haber JE, Hobolth A, Imielinski M, Kellis M, Lawrence MS, von Mering C, Nakagawa H, Raphael BJ, Rubin MA, Sander C, Stein LD, Stuart JM, Tsunoda T, Wheeler DA, Johnson R, Reimand J, Gerstein M, Khurana E, Campbell PJ, López-Bigas N, Weischenfeldt J, Beroukhim R, Martincorena I, Pedersen JS, Getz G. Analyses of non-coding somatic drivers in 2,658 cancer whole genomes. Nature 2020; 578:102-111. [PMID: 32025015 PMCID: PMC7054214 DOI: 10.1038/s41586-020-1965-x] [Citation(s) in RCA: 400] [Impact Index Per Article: 80.0] [Received: 01/19/2018] [Accepted: 12/02/2019] [Indexed: 01/28/2023]
Abstract
The discovery of drivers of cancer has traditionally focused on protein-coding genes [1-4]. Here we present analyses of driver point mutations and structural variants in non-coding regions across 2,658 genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium [5] of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). For point mutations, we developed a statistically rigorous strategy for combining significance levels from multiple methods of driver discovery that overcomes the limitations of individual methods. For structural variants, we present two methods of driver discovery, and identify regions that are significantly affected by recurrent breakpoints and recurrent somatic juxtapositions. Our analyses confirm previously reported drivers [6,7], raise doubts about others and identify novel candidates, including point mutations in the 5' region of TP53, in the 3' untranslated regions of NFKBIZ and TOB1, focal deletions in BRD4 and rearrangements in the loci of AKR1C genes. We show that although point mutations and structural variants that drive cancer are less frequent in non-coding genes and regulatory sequences than in protein-coding genes, additional examples of these drivers will be found as more cancer genomes become available.
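The abstract above mentions combining significance levels from several driver-discovery methods but does not say how. As a generic illustration of the idea only, the sketch below implements Fisher's method, which combines k independent p-values through the statistic X = -2 Σ ln p_i, chi-squared with 2k degrees of freedom under the null. This is an assumed stand-in, not the strategy used in the paper; the function name `fisher_combine` and the example p-values are hypothetical.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    Uses the closed-form chi-squared survival function for an even number
    of degrees of freedom (2k): sf(x) = exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # X / 2
    term = 1.0   # (X/2)^i / i!, starting at i = 0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical p-values reported by two methods for the same genomic element.
combined = fisher_combine([0.05, 0.05])
```

Fisher's method assumes the input p-values are independent. Methods run on the same mutation data generally produce correlated p-values, so a dependence-aware combination would be needed in practice.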
Affiliation(s)
- Esther Rheinbay
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Morten Muhlig Nielsen
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
| | | | - Jeremiah A Wala
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Bioinformatics and Integrative Genomics, Harvard University, Cambridge, MA, USA
| | - Ofer Shapira
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Grace Tiao
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Henrik Hornshøj
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
| | - Julian M Hess
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Randi Istrup Juul
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
| | - Ziao Lin
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard University, Cambridge, MA, USA
| | - Lars Feuerbach
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Radhakrishnan Sabarinathan
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Spain
| | - Tobias Madsen
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
| | - Jaegil Kim
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Loris Mularoni
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Spain
| | - Shimin Shuai
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Andrés Lanzós
- Department for BioMedical Research, University of Bern, Bern, Switzerland
- Graduate School of Cellular and Biomedical Sciences, University of Bern, Bern, Switzerland
- Department of Medical Oncology, Bern University Hospital, University of Bern, Bern, Switzerland
| | - Carl Herrmann
- Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Bioquant Center, Institute of Pharmacy and Molecular Biotechnology, University of Heidelberg, Heidelberg, Germany
| | - Yosef E Maruvka
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
| | - Ciyue Shen
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- cBio Center, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
| | - Samirkumar B Amin
- Department of Genomic Medicine, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Graduate Program in Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, TX, USA
| | - Pratiti Bandopadhayay
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Johanna Bertl
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
| | - Keith A Boroevich
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - John Busanovich
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Joana Carlevaro-Fita
- Department for BioMedical Research, University of Bern, Bern, Switzerland
- Graduate School of Cellular and Biomedical Sciences, University of Bern, Bern, Switzerland
- Department of Medical Oncology, Bern University Hospital, University of Bern, Bern, Switzerland
| | - Dimple Chakravarty
- Department of Genitourinary Medical Oncology - Research, Division of Cancer Medicine, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Department of Urology, Icahn school of Medicine at Mount Sinai, New York, NY, USA
| | - Calvin Wing Yiu Chan
- Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
| | - David Craft
- Department of Radiation Oncology, Harvard Medical School, Massachusetts General Hospital, Boston, MA, USA
| | - Priyanka Dhingra
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
| | - Klev Diamanti
- Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| | - Nuno A Fonseca
- European Bioinformatics Institute, European Molecular Biology Laboratory, Hinxton, UK
| | - Abel Gonzalez-Perez
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Spain
| | - Qianyun Guo
- Bioinformatics Research Centre (BiRC), Aarhus University, Aarhus, Denmark
| | - Mark P Hamilton
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
| | - Nicholas J Haradhvala
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
| | - Chen Hong
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
| | - Keren Isaev
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
| | - Todd A Johnson
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Malene Juul
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
| | - Andre Kahles
- Division of Computational Biology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Abdullah Kahraman
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
| | - Youngwook Kim
- Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
| | - Jan Komorowski
- Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
| | - Kiran Kumar
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Sushant Kumar
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
| | - Donghoon Lee
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
| | - Kjong-Van Lehmann
- Division of Computational Biology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Yilong Li
- SBGD Inc, Cambridge, MA, USA
- Department of Haematology, University of Cambridge, Cambridge, UK
| | - Eric Minwei Liu
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
| | - Lucas Lochovsky
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Keunchil Park
- Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
| | - Oriol Pich
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Spain
| | - Nicola D Roberts
- Department of Haematology, University of Cambridge, Cambridge, UK
| | - Gordon Saksena
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Steven E Schumacher
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Nikos Sidiropoulos
- Biotech Research & Innovation Centre (BRIC), The Finsen Laboratory, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
| | - Lina Sieverling
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
| | | | - Chip Stewart
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - David Tamborero
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Spain
| | - Jose M C Tubio
- Department of Zoology, Genetics and Physical Anthropology, Universidade de Santiago de Compostela, Santiago de Compostela, Spain
- Centre for Research in Molecular Medicine and Chronic Diseases (CIMUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
- The Biomedical Research Centre (CINBIO), Universidade de Vigo, Vigo, Spain
| | - Husen M Umer
- Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Department of Oncology-Pathology, Science for Life Laboratory, Karolinska Institute, Stockholm, Sweden
| | - Liis Uusküla-Reimand
- Genetics and Genome Biology Program, SickKids Research Institute, Toronto, Ontario, Canada
- Department of Gene Technology, Tallinn University of Technology, Tallinn, Estonia
| | - Claes Wadelius
- Science for Life Laboratory, Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
| | - Lina Wadi
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | | | - Cheng-Zhong Zhang
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Jing Zhang
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
| | - James E Haber
- Department of Biology and Rosenstiel Basic Medical Sciences Research Center, Brandeis University, Waltham, MA, USA
| | - Asger Hobolth
- Bioinformatics Research Centre (BiRC), Aarhus University, Aarhus, Denmark
| | - Marcin Imielinski
- New York Genome Center, New York, NY, USA
- Department of Pathology and Laboratory Medicine, and Englander Institute for Precision Medicine, and Institute for Computational Biomedicine, and Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
| | - Manolis Kellis
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
| | - Michael S Lawrence
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
| | - Christian von Mering
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
| | - Hidewaki Nakagawa
- Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Tokyo, Japan
| | - Benjamin J Raphael
- Department of Computer Science, Princeton University, Princeton, NJ, USA
| | - Mark A Rubin
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
- Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
| | - Chris Sander
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- cBio Center, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
| | - Lincoln D Stein
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Joshua M Stuart
- Center for Biomolecular Science and Engineering, University of California at Santa Cruz, Santa Cruz, CA, USA
| | - Tatsuhiko Tsunoda
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
- Laboratory for Medical Science Mathematics, Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
| | - David A Wheeler
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Rory Johnson
- Department for BioMedical Research, University of Bern, Bern, Switzerland
- Department of Medical Oncology, Bern University Hospital, University of Bern, Bern, Switzerland
| | - Jüri Reimand
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
| | - Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Department of Computer Science, Yale University, New Haven, CT, USA
| | - Ekta Khurana
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
| | - Peter J Campbell
- Wellcome Trust Sanger Institute, Hinxton, UK
- Department of Haematology, University of Cambridge, Cambridge, UK
| | - Núria López-Bigas
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Spain
- Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
| | - Joachim Weischenfeldt
- Biotech Research & Innovation Centre (BRIC), The Finsen Laboratory, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
| | - Rameen Beroukhim
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Bioinformatics and Integrative Genomics, Harvard University, Cambridge, MA, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | | | - Jakob Skou Pedersen
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
- Bioinformatics Research Centre (BiRC), Aarhus University, Aarhus, Denmark
| | - Gad Getz
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
- Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
| |
|
64
|
Zhu H, Uusküla-Reimand L, Isaev K, Wadi L, Alizada A, Shuai S, Huang V, Aduluso-Nwaobasi D, Paczkowska M, Abd-Rabbo D, Ocsenas O, Liang M, Thompson JD, Li Y, Ruan L, Krassowski M, Dzneladze I, Simpson JT, Lupien M, Stein LD, Boutros PC, Wilson MD, Reimand J. Candidate Cancer Driver Mutations in Distal Regulatory Elements and Long-Range Chromatin Interaction Networks. Mol Cell 2020; 77:1307-1321.e10. [PMID: 31954095 DOI: 10.1016/j.molcel.2019.12.027] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 01/02/2019] [Revised: 06/04/2019] [Accepted: 12/24/2019] [Indexed: 12/17/2022]
Abstract
A comprehensive catalog of cancer driver mutations is essential for understanding tumorigenesis and developing therapies. Exome-sequencing studies have mapped many protein-coding drivers, yet few non-coding drivers are known because genome-wide discovery is challenging. We developed a driver discovery method, ActiveDriverWGS, and analyzed 120,788 cis-regulatory modules (CRMs) across 1,844 whole tumor genomes from the ICGC-TCGA PCAWG project. We found 30 CRMs with enriched SNVs and indels (FDR < 0.05). These frequently mutated regulatory elements (FMREs) were ubiquitously active in human tissues, showed long-range chromatin interactions and mRNA abundance associations with target genes, and were enriched in motif-rewiring mutations and structural variants. Genomic deletion of one FMRE in human cells caused proliferative deficiencies and transcriptional deregulation of cancer genes CCNB1IP1, CDH1, and CDKN2B, validating observations in FMRE-mutated tumors. Pathway analysis revealed further sub-significant FMREs at cancer genes and processes, indicating an unexplored landscape of infrequent driver mutations in the non-coding genome.
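The abstract above selects elements at FDR < 0.05 without naming the correction procedure. A common choice for this kind of genome-wide screen is the Benjamini-Hochberg adjustment, sketched below as an assumption rather than as the paper's documented method. The function name `benjamini_hochberg` and the toy p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value at ascending rank r (1-based) out of m tests is scaled to
    p * m / r, then a running minimum from the largest rank downward
    enforces monotonicity of the adjusted values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values for four hypothetical regulatory elements.
toy_p = [0.01, 0.04, 0.03, 0.5]
toy_q = benjamini_hochberg(toy_p)
significant = [i for i, q in enumerate(toy_q) if q < 0.05]
```

In this toy input only the first element passes the 0.05 threshold after adjustment, even though three raw p-values are below 0.05, which is the effect the correction is meant to have.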
Affiliation(s)
- Helen Zhu
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada
| | - Liis Uusküla-Reimand
- Program in Genetics and Genome Biology, SickKids Research Institute, Peter Gilgan Centre for Research and Learning (PGCRL), 686 Bay Street, Toronto, ON M5G 0A4, Canada; Division of Gene Technology, Department of Chemistry and Biotechnology, Tallinn University of Technology, Akadeemia tee 15, Tallinn 12618, Estonia
| | - Keren Isaev
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada
| | - Lina Wadi
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Azad Alizada
- Program in Genetics and Genome Biology, SickKids Research Institute, Peter Gilgan Centre for Research and Learning (PGCRL), 686 Bay Street, Toronto, ON M5G 0A4, Canada
| | - Shimin Shuai
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Molecular Genetics, University of Toronto, 1 King's College Circle, Toronto, ON M5S 1A8, Canada
| | - Vincent Huang
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Dike Aduluso-Nwaobasi
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Marta Paczkowska
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Diala Abd-Rabbo
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Oliver Ocsenas
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada
| | - Minggao Liang
- Program in Genetics and Genome Biology, SickKids Research Institute, Peter Gilgan Centre for Research and Learning (PGCRL), 686 Bay Street, Toronto, ON M5G 0A4, Canada; Department of Molecular Genetics, University of Toronto, 1 King's College Circle, Toronto, ON M5S 1A8, Canada
| | - J Drew Thompson
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Yao Li
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Luyao Ruan
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Michal Krassowski
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Irakli Dzneladze
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Jared T Simpson
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Computer Science, University of Toronto, 214 College Street, Toronto, ON M5T 3A1, Canada
| | - Mathieu Lupien
- Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada; Princess Margaret Cancer Centre, 101 College Street, Toronto, ON M5G 0A3, Canada
| | - Lincoln D Stein
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Molecular Genetics, University of Toronto, 1 King's College Circle, Toronto, ON M5S 1A8, Canada
| | - Paul C Boutros
- Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada; Department of Human Genetics, University of California Los Angeles, 10833 Le Conte Avenue, Los Angeles, CA 90095, USA; Department of Urology, University of California Los Angeles, 200 Medical Plaza Driveway #140, Los Angeles, CA 90024, USA; Institute of Precision Health, University of California Los Angeles, 10833 Le Conte Avenue, Los Angeles, CA 90024, USA; Jonsson Comprehensive Cancer Centre, University of California Los Angeles, 10833 Le Conte Avenue, Los Angeles, CA 90024, USA
| | - Michael D Wilson
- Program in Genetics and Genome Biology, SickKids Research Institute, Peter Gilgan Centre for Research and Learning (PGCRL), 686 Bay Street, Toronto, ON M5G 0A4, Canada; Department of Molecular Genetics, University of Toronto, 1 King's College Circle, Toronto, ON M5S 1A8, Canada
| | - Jüri Reimand
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada.
| |
|
65
|
Xiao F, Zhang P, Wang Y, Tian Y, James M, Huang CC, Wang L, Wang L. Single-nucleotide polymorphism rs13426236 contributes to an increased prostate cancer risk via regulating MLPH splicing variant 4. Mol Carcinog 2020; 59:45-55. [PMID: 31659808 PMCID: PMC7219604 DOI: 10.1002/mc.23127] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 05/30/2019] [Revised: 10/07/2019] [Accepted: 10/09/2019] [Indexed: 12/20/2022]
Abstract
A prostate cancer risk single-nucleotide polymorphism (SNP), rs13426236, is significantly associated with melanophilin (MLPH) expression. To functionally characterize the role of rs13426236 in prostate cancer, we first performed splicing-specific expression quantitative trait loci analysis and refined the significant association of rs13426236 allele G with an increased expression of MLPH splicing transcript variant 4 (V4) (P = 7.61E-5) but not other protein-coding variants (V1-V3) (P > .05). We then performed an allele-specific reporter assay to determine if SNP-containing sequences functioned as an active enhancer. Compared to allele A, allele G of rs13426236 showed significantly higher luciferase activity on the promoter of the splicing transcript V4 (P < .03) but not on the promoter of transcript V1 (P > .05) in two prostate cancer cell lines (DU145 and 22Rv1). Cell transfection assays showed a stronger effect of transcript V4 than V1 on promoting cell proliferation, invasion, and antiapoptotic activities. RNA profiling analysis demonstrated that transcript V4 overexpression caused significant expression changes in glycosylation/glycoprotein and metal-binding gene ontology pathways (FDR < 0.01). We also found that both transcripts V4 and V1 were significantly upregulated in prostate adenocarcinoma (P ≤ 2.49E-6) but only transcript V4 upregulation was associated with poor recurrence-free survival (P = .028, hazard ratio = 1.63, 95% confidence interval = 1.05-2.42) in The Cancer Genome Atlas data. This study provides strong evidence that prostate cancer risk SNP rs13426236 upregulates expression of MLPH transcript V4, which may function as a candidate oncogene in prostate cancer.
Affiliation(s)
- Fankai Xiao
- Henan Key Laboratory for Cancer Research, The First Affiliated Hospital of Zhengzhou University, 40 Daxue Road, Zhengzhou, Henan 450052, China
- Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, Wisconsin 53226, USA
| | - Peng Zhang
- Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, Wisconsin 53226, USA
| | - Yuan Wang
- Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, Wisconsin 53226, USA
| | - Yijun Tian
- Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, Wisconsin 53226, USA
| | - Michael James
- Department of Surgery, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, Wisconsin 53226, USA
| | - Chiang-Ching Huang
- Department of Biostatistics, University of Wisconsin, Milwaukee, Wisconsin 53201, USA
| | - Lidong Wang
- Henan Key Laboratory for Cancer Research, The First Affiliated Hospital of Zhengzhou University, 40 Daxue Road, Zhengzhou, Henan 450052, China
| | - Liang Wang
- Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, Wisconsin 53226, USA
- Department of Tumor Biology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida 33612, USA
| |
|
66
|
Lu L, Liu H, Wu Y, Yan G. Development and Characterization of Near-Isogenic Lines Revealing Candidate Genes for a Major 7AL QTL Responsible for Heat Tolerance in Wheat. Frontiers in Plant Science 2020; 11:1316. [PMID: 32983205 PMCID: PMC7485290 DOI: 10.3389/fpls.2020.01316] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/11/2020] [Accepted: 08/11/2020] [Indexed: 05/22/2023]
Abstract
Wheat is one of the most important food crops in the world, but as a cool-season crop, it is more prone to heat stress, which severely affects crop production and grain quality. Heat tolerance in wheat is a quantitative trait, and the genes underlying reported quantitative trait loci (QTL) have rarely been identified. Near-isogenic lines (NILs) with a common genetic background but differing at a particular locus could turn quantitative traits into a Mendelian factor; therefore, they are suitable material for identifying candidate genes for targeted locus/loci. In this study, we developed and characterized NILs from two populations Cascades × Tevere and Cascades × W156 targeting a major 7AL QTL responsible for heat tolerance. Molecular marker screening and phenotyping for SPAD chlorophyll content and grain-yield-related traits confirmed four pairs of wheat NILs that contrasted for heat-stress responses. Genotyping the NILs using a 90K Infinium iSelect SNP array revealed five single nucleotide polymorphism (SNP) markers within the QTL interval that were distinguishable between the isolines. Seven candidate genes linked to the SNPs were identified as related to heat tolerance, and involved in important processes and pathways in response to heat stress. The confirmed multiple pairs of NILs and identified candidate genes in this study are valuable resources and information for further fine-mapping to clone major genes for heat tolerance.
Affiliation(s)
- Lu Lu
- Faculty of Science, UWA School of Agriculture and Environment, The University of Western Australia, Perth, WA, Australia
- The UWA Institute of Agriculture, The University of Western Australia, Perth, WA, Australia
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, China
| | - Hui Liu
- Faculty of Science, UWA School of Agriculture and Environment, The University of Western Australia, Perth, WA, Australia
- The UWA Institute of Agriculture, The University of Western Australia, Perth, WA, Australia
- *Correspondence: Hui Liu; Guijun Yan
| | - Yu Wu
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, China
| | - Guijun Yan
- Faculty of Science, UWA School of Agriculture and Environment, The University of Western Australia, Perth, WA, Australia
- The UWA Institute of Agriculture, The University of Western Australia, Perth, WA, Australia
- *Correspondence: Hui Liu; Guijun Yan
| |
|
67
|
Capasso M, Lasorsa VA, Cimmino F, Avitabile M, Cantalupo S, Montella A, De Angelis B, Morini M, de Torres C, Castellano A, Locatelli F, Iolascon A. Transcription Factors Involved in Tumorigenesis Are Over-Represented in Mutated Active DNA-Binding Sites in Neuroblastoma. Cancer Res 2019; 80:382-393. [PMID: 31784426 DOI: 10.1158/0008-5472.can-19-2883] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 09/13/2019] [Revised: 10/24/2019] [Accepted: 11/22/2019] [Indexed: 11/16/2022]
Abstract
The contribution of coding mutations to oncogenesis has been largely clarified, whereas little is known about somatic mutations in noncoding DNA, and their role in driving tumors remains controversial. Here, we used an alternative approach to interpret the functional significance of noncoding somatic mutations in promoting tumorigenesis. Noncoding somatic mutations of 151 neuroblastomas were integrated with ENCODE data to locate somatic mutations in regulatory elements specifically active in neuroblastoma cells, nonspecifically active in neuroblastoma cells, and nonactive. Within these types of elements, transcription factors (TF) were identified whose binding sites were enriched or depleted in mutations. For these TFs, a gene expression signature was built to assess their implication in neuroblastoma. DNA- and RNA-sequencing data were integrated to assess the effects of those mutations on mRNA levels. The pathogenicity of mutations was significantly higher in transcription factor binding sites (TFBS) of regulatory elements specifically active in neuroblastoma cells, as compared with the others. Within these elements, there were 18 over-represented TFs involved mainly in cell-cycle phase transitions and 15 under-represented TFs primarily regulating cell differentiation. A gene expression signature based on over-represented TFs correlated with poor survival and unfavorable prognostic markers. Moreover, recurrent mutations in TFBS of over-represented TFs such as EZH2 affected MCF2L and ADP-ribosylhydrolase like 1 expression, among others. We propose a novel approach to study the involvement of regulatory variants in neuroblastoma that could be extended to other cancers and provide further evidence that alterations of gene expression may have relevant effects in neuroblastoma development.
SIGNIFICANCE: These findings propose a novel approach to study regulatory variants in neuroblastoma and suggest that noncoding somatic mutations have relevant implications in neuroblastoma development.
Affiliation(s)
- Mario Capasso
- Department of Molecular Medicine and Medical Biotechnology, Università degli Studi di Napoli Federico II, Napoli, Italy.,CEINGE Biotecnologie Avanzate, Napoli, Italy.,IRCCS SDN, Napoli, Italy
| | - Vito Alessandro Lasorsa
- Department of Molecular Medicine and Medical Biotechnology, Università degli Studi di Napoli Federico II, Napoli, Italy.,CEINGE Biotecnologie Avanzate, Napoli, Italy
| | - Flora Cimmino
- Department of Molecular Medicine and Medical Biotechnology, Università degli Studi di Napoli Federico II, Napoli, Italy.,CEINGE Biotecnologie Avanzate, Napoli, Italy
| | - Marianna Avitabile
- Department of Molecular Medicine and Medical Biotechnology, Università degli Studi di Napoli Federico II, Napoli, Italy.,CEINGE Biotecnologie Avanzate, Napoli, Italy
| | | | - Annalaura Montella
- Department of Molecular Medicine and Medical Biotechnology, Università degli Studi di Napoli Federico II, Napoli, Italy.,CEINGE Biotecnologie Avanzate, Napoli, Italy
| | - Biagio De Angelis
- Department of Pediatric Haematology and Oncology, IRCCS Ospedale Pediatrico Bambino Gesù, Roma, Italy
| | - Martina Morini
- Laboratory of Molecular Biology, IRCCS Istituto Giannina Gaslini, Genova, Italy
| | - Carmen de Torres
- Developmental Tumor Biology Laboratory, Department of Oncology, Hospital Sant Joan de Déu, Barcelona, Spain
| | - Aurora Castellano
- Department of Pediatric Haematology and Oncology, IRCCS Ospedale Pediatrico Bambino Gesù, Roma, Italy
| | - Franco Locatelli
- Department of Pediatric Haematology and Oncology, IRCCS Ospedale Pediatrico Bambino Gesù, Roma, Italy.,Department of Paediatrics, Sapienza University of Rome, Roma, Italy
| | - Achille Iolascon
- Department of Molecular Medicine and Medical Biotechnology, Università degli Studi di Napoli Federico II, Napoli, Italy.,CEINGE Biotecnologie Avanzate, Napoli, Italy
| |
|
68
|
Dutil J, Teer JK, Golubeva V, Yoder S, Tong WL, Arroyo N, Karam R, Echenique M, Matta JL, Monteiro AN. Germline variants in cancer genes in high-risk non-BRCA patients from Puerto Rico. Sci Rep 2019; 9:17769. [PMID: 31780696 PMCID: PMC6882826 DOI: 10.1038/s41598-019-54170-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 02/13/2019] [Accepted: 11/05/2019] [Indexed: 12/30/2022] Open
Abstract
Inherited pathogenic variants in genes that confer moderate to high risk of breast cancer may explain up to 50% of familial breast cancer. This study aimed at identifying inherited pathogenic variants in breast cancer cases from Puerto Rico that were not linked to BRCA1 or BRCA2. Forty-eight breast cancer patients who met the clinical criteria for BRCA testing but had received a negative BRCA1/2 result were recruited. Fifty-three genes previously implicated in hereditary cancer predisposition were captured using the BROCA Agilent cancer risk panel followed by massively parallel sequencing. Missense variants of uncertain clinical significance in CHEK2 were evaluated using in vitro kinase assays to determine their impact on function. Pathogenic variants were identified in CHEK2, MUTYH, and RAD51B in four breast cancer patients, who represented 8.3% of the cohort. We identified three rare missense variants of uncertain significance in CHEK2, and two variants (p.Pro484Leu and p.Glu239Lys) showed markedly decreased kinase activity in vitro, comparable to a known pathogenic variant. Interestingly, the local ancestry at the RAD51B locus in the carrier of p.Arg47* was predicted to be of African origin. In this cohort, 12.5% of the BRCA-negative breast cancer patients were found to carry a known pathogenic variant or a variant affecting protein activity. This study reveals an unmet clinical need of genetic testing that could benefit a significant proportion of at-risk Latinas. It also highlights the complexity of Hispanic populations as pathogenic factors may originate from any of the ancestral populations that make up their genetic backgrounds.
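The two percentages quoted in this abstract can be checked against the stated cohort size of 48. The short sketch below does that arithmetic. The count of six carriers behind the 12.5% figure is an inference from the percentage (it would match the four pathogenic-variant carriers plus two carriers of functionally impaired CHEK2 variants) and is not a number stated in the abstract; the function name is hypothetical.

```python
def cohort_percentage(carriers, cohort_size):
    """Return carriers as a percentage of the cohort, rounded to one decimal."""
    return round(100 * carriers / cohort_size, 1)

COHORT_SIZE = 48  # BRCA1/2-negative patients recruited, per the abstract

# Four carriers of known pathogenic variants (stated in the abstract).
pathogenic_pct = cohort_percentage(4, COHORT_SIZE)

# Six carriers is an ASSUMPTION inferred from the reported 12.5%.
pathogenic_or_functional_pct = cohort_percentage(6, COHORT_SIZE)

print(pathogenic_pct)                # 8.3
print(pathogenic_or_functional_pct)  # 12.5
```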
Affiliation(s)
- Julie Dutil
- Cancer Biology Division, Ponce Research Institute, Ponce Health Sciences University, Ponce, PR, USA.
| | - Jamie K Teer
- Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
| | - Volha Golubeva
- Cancer Epidemiology Program, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
| | - Sean Yoder
- Molecular Genomics Core, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
| | - Wei Lue Tong
- University of South Florida Morsani College of Medicine, Tampa, FL, USA
| | - Nelly Arroyo
- Cancer Biology Division, Ponce Research Institute, Ponce Health Sciences University, Ponce, PR, USA
| | | | - Miguel Echenique
- Auxilio Cancer Center, Auxilio Mutuo Hospital, San Juan, PR, USA
| | - Jaime L Matta
- Cancer Biology Division, Ponce Research Institute, Ponce Health Sciences University, Ponce, PR, USA
| | - Alvaro N Monteiro
- Cancer Epidemiology Program, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
| |
|
69
|
Liu Y, Liu B, Jin G, Zhang J, Wang X, Feng Y, Bian Z, Fei B, Yin Y, Huang Z. An Integrated Three-Long Non-coding RNA Signature Predicts Prognosis in Colorectal Cancer Patients. Front Oncol 2019; 9:1269. [PMID: 31824849 PMCID: PMC6883412 DOI: 10.3389/fonc.2019.01269] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 07/28/2019] [Accepted: 11/04/2019] [Indexed: 01/25/2023] Open
Abstract
Colorectal cancer (CRC) is one of the most common cancers worldwide, and its morbidity and mortality have gradually increased. Here, we aimed to identify and assess prognostic long non-coding RNAs (lncRNAs) associated with overall survival (OS) in CRC. First, RNA expression profiles were obtained from The Cancer Genome Atlas (TCGA) database, and 439 CRC patients were enrolled as a training set. Univariate Cox analysis and least absolute shrinkage and selection operator (LASSO) analysis were performed to identify the prognostic lncRNAs. Multivariable Cox regression analysis was used to establish a prognostic risk formula including three lncRNAs (AP003555.2, AP006284.1, and LINC01602). The low-risk group had a better OS than the high-risk group (P < 0.0001), and the areas under the receiver operating characteristic curve (AUCs) of 3- and 5-year OS were 0.712 and 0.674, respectively. Then, we evaluated the signature in a clinical validation set collected from the Affiliated Hospital of Jiangnan University. OS was significantly worse in the high-risk group than in the low-risk group (P = 0.0057). The AUCs of 3- and 5-year OS were 0.701 and 0.694, respectively. Finally, we constructed an lncRNA-microRNA (miRNA)-messenger RNA (mRNA) competing endogenous RNA (ceRNA) network to explore the potential function of three differentially expressed lncRNAs (DElncRNAs). The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis indicated that these DElncRNAs were involved with several cancer-related pathways. In summary, our data provide evidence that the three-lncRNA signature could serve as an independent biomarker to predict prognosis in CRC. This study also suggests that these three lncRNAs potentially participate in the progression of CRC.
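A prognostic risk formula of the kind described in this abstract is commonly a linear combination of Cox regression coefficients and expression values, with patients then split into high- and low-risk groups. The sketch below illustrates that general pattern only. The coefficients, the expression values, and the median cutoff are all assumptions made for illustration; the abstract reports neither the fitted coefficients nor the cutoff rule, and the function names are hypothetical.

```python
from statistics import median

# Placeholder coefficients: NOT the values fitted in the study.
COEFFICIENTS = {"AP003555.2": 0.40, "AP006284.1": 0.25, "LINC01602": 0.30}

def risk_score(expression):
    """Linear predictor: sum of coefficient * expression over the three lncRNAs."""
    return sum(COEFFICIENTS[gene] * expression[gene] for gene in COEFFICIENTS)

def stratify(patients):
    """Label each patient 'high' or 'low' relative to the cohort median score.

    The median split is an assumed cutoff rule, chosen here for simplicity.
    """
    scores = {pid: risk_score(expr) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

# Toy expression values, invented for illustration only.
patients = {
    "P1": {"AP003555.2": 1.0, "AP006284.1": 1.0, "LINC01602": 1.0},
    "P2": {"AP003555.2": 2.0, "AP006284.1": 2.0, "LINC01602": 2.0},
    "P3": {"AP003555.2": 0.5, "AP006284.1": 0.5, "LINC01602": 0.5},
    "P4": {"AP003555.2": 3.0, "AP006284.1": 1.0, "LINC01602": 2.0},
}

groups = stratify(patients)
print(groups)
```

In a real analysis the coefficients would come from the fitted multivariable Cox model, and the two groups would then be compared with a survival test such as the log-rank test.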
Affiliation(s)
- Yuhang Liu
- Wuxi Cancer Institute, Affiliated Hospital of Jiangnan University, Wuxi, China
- Laboratory of Cancer Epigenetics, Wuxi School of Medicine, Jiangnan University, Wuxi, China
| | - Bingxin Liu
- Wuxi Cancer Institute, Affiliated Hospital of Jiangnan University, Wuxi, China
- Laboratory of Cancer Epigenetics, Wuxi School of Medicine, Jiangnan University, Wuxi, China
| | - Guoying Jin
- Wuxi Cancer Institute, Affiliated Hospital of Jiangnan University, Wuxi, China
- Laboratory of Cancer Epigenetics, Wuxi School of Medicine, Jiangnan University, Wuxi, China
| | - Jia Zhang
- Wuxi Cancer Institute, Affiliated Hospital of Jiangnan University, Wuxi, China
- Laboratory of Cancer Epigenetics, Wuxi School of Medicine, Jiangnan University, Wuxi, China
| | - Xue Wang
- Laboratory of Cancer Epigenetics, Wuxi School of Medicine, Jiangnan University, Wuxi, China
| | - Yuyang Feng
- Laboratory of Cancer Epigenetics, Wuxi School of Medicine, Jiangnan University, Wuxi, China
| | - Zehua Bian
- Wuxi Cancer Institute, Affiliated Hospital of Jiangnan University, Wuxi, China
- Laboratory of Cancer Epigenetics, Wuxi School of Medicine, Jiangnan University, Wuxi, China
| | - Bojian Fei
- Department of Surgical Oncology, Affiliated Hospital of Jiangnan University, Wuxi, China
| | - Yuan Yin
- Wuxi Cancer Institute, Affiliated Hospital of Jiangnan University, Wuxi, China
- Laboratory of Cancer Epigenetics, Wuxi School of Medicine, Jiangnan University, Wuxi, China
| | - Zhaohui Huang
- Wuxi Cancer Institute, Affiliated Hospital of Jiangnan University, Wuxi, China
- Laboratory of Cancer Epigenetics, Wuxi School of Medicine, Jiangnan University, Wuxi, China
| |
|
70
|
Williams SM, An JY, Edson J, Watts M, Murigneux V, Whitehouse AJO, Jackson CJ, Bellgrove MA, Cristino AS, Claudianos C. An integrative analysis of non-coding regulatory DNA variations associated with autism spectrum disorder. Mol Psychiatry 2019; 24:1707-1719. [PMID: 29703944 DOI: 10.1038/s41380-018-0049-x] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Received: 10/15/2016] [Revised: 01/16/2018] [Accepted: 02/19/2018] [Indexed: 01/09/2023]
Abstract
A number of genetic studies have identified rare protein-coding DNA variations associated with autism spectrum disorder (ASD), a neurodevelopmental disorder with significant genetic etiology and heterogeneity. In contrast, the contributions of functional, regulatory genetic variations that occur in the extensive non-protein-coding regions of the genome remain poorly understood. Here we developed a genome-wide analysis to identify the rare single nucleotide variants (SNVs) that occur in non-coding regions and determined the regulatory function and evolutionary conservation of these variants. Using publicly available datasets and computational predictions, we identified SNVs within putative regulatory regions in promoters, transcription factor binding sites, and microRNA genes and their target sites. Overall, we found that the regulatory variants in ASD cases were enriched in ASD-risk genes and genes involved in fetal neurodevelopment. As with previously reported coding mutations, we found an enrichment of the regulatory variants associated with dysregulation of neurodevelopmental and synaptic signaling pathways. Among these were several rare inherited SNVs found in the mature sequence of microRNAs predicted to affect the regulation of ASD-risk genes. We show that a paternally inherited miR-873-5p variant with altered binding affinity for several risk genes, including NRXN2 and CNTNAP2, putatively overlays maternally inherited loss-of-function coding variations in NRXN1 and CNTNAP2, likely increasing the genetic liability in an idiopathic ASD case. Our analysis pipeline provides a new resource for identifying loss-of-function regulatory DNA variations that may contribute to the genetic etiology of complex disorders.
Affiliation(s)
- Sarah M Williams
- University of Queensland Diamantina Institute, University of Queensland, Brisbane, Australia.,Queensland Brain Institute, University of Queensland, Brisbane, Australia
| | - Joon Yong An
- Queensland Brain Institute, University of Queensland, Brisbane, Australia.,Department of Psychiatry, University of California San Francisco, San Francisco, USA.,Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, USA
| | - Janette Edson
- Queensland Brain Institute, University of Queensland, Brisbane, Australia
| | - Michelle Watts
- Queensland Brain Institute, University of Queensland, Brisbane, Australia
| | - Valentine Murigneux
- University of Queensland Diamantina Institute, University of Queensland, Brisbane, Australia
| | - Andrew J O Whitehouse
- Telethon Kids Institute, University of Western Australia, Perth, Australia.,Cooperative Research Centre for Living with Autism, Brisbane, Australia
| | - Colin J Jackson
- Research School of Chemistry, Australian National University, Canberra, Australia
| | - Mark A Bellgrove
- Monash Institute of Cognitive and Clinical Neuroscience, Monash University, Melbourne, Australia
| | - Alexandre S Cristino
- University of Queensland Diamantina Institute, University of Queensland, Brisbane, Australia.
| | - Charles Claudianos
- Queensland Brain Institute, University of Queensland, Brisbane, Australia.,Centre for Mental Health Research CMHR, Australian National University, Canberra, Australia.
| |
|
71
|
Lee H, Zhang Z, Krause HM. Long Noncoding RNAs and Repetitive Elements: Junk or Intimate Evolutionary Partners? Trends Genet 2019; 35:892-902. [PMID: 31662190 DOI: 10.1016/j.tig.2019.09.006] [Citation(s) in RCA: 104] [Impact Index Per Article: 17.3] [Received: 06/26/2019] [Revised: 08/22/2019] [Accepted: 09/13/2019] [Indexed: 12/27/2022]
Abstract
Our recent ability to sequence entire genomes, along with all of their transcribed RNAs, has led to the surprising finding that only ∼1% of the human genome is used to encode proteins. This finding has led to vigorous debate over the functional importance of the transcribed but untranslated portions of the genome. Currently, scientists tend to assume coding genes are functional until proven not to be, while the opposite is true for noncoding genes. This review takes a new look at the evidence for and against widespread noncoding gene functionality. We focus in particular on long noncoding RNA (noncoding RNAs longer than 200 nucleotides) genes and their 'junk' associates, transposable elements, and satellite repeats. Taken together, the suggestion put forward is that more of this junk DNA may be functional than nonfunctional and that noncoding RNAs and transposable elements act symbiotically to drive evolution.
Affiliation(s)
- Hyunmin Lee
- Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
| | - Zhaolei Zhang
- Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada
| | - Henry M Krause
- Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada.
| |
|
72
|
Fragoza R, Das J, Wierbowski SD, Liang J, Tran TN, Liang S, Beltran JF, Rivera-Erick CA, Ye K, Wang TY, Yao L, Mort M, Stenson PD, Cooper DN, Wei X, Keinan A, Schimenti JC, Clark AG, Yu H. Extensive disruption of protein interactions by genetic variants across the allele frequency spectrum in human populations. Nat Commun 2019; 10:4141. [PMID: 31515488 PMCID: PMC6742646 DOI: 10.1038/s41467-019-11959-3] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Received: 01/09/2019] [Accepted: 08/06/2019] [Indexed: 12/19/2022] Open
Abstract
Each human genome carries tens of thousands of coding variants. The extent to which this variation is functional and the mechanisms by which it exerts its influence remain largely unexplored. To address this gap, we leverage the ExAC database of 60,706 human exomes to investigate experimentally the impact of 2009 missense single nucleotide variants (SNVs) across 2185 protein-protein interactions, generating interaction profiles for 4797 SNV-interaction pairs, of which 421 SNVs segregate at > 1% allele frequency in human populations. We find that interaction-disruptive SNVs are prevalent at both rare and common allele frequencies. Furthermore, these results suggest that 10.5% of missense variants carried per individual are disruptive, a higher proportion than previously reported; this indicates that each individual's genetic makeup may be significantly more complex than expected. Finally, we demonstrate that candidate disease-associated mutations can be identified through shared interaction perturbations between variants of interest and known disease mutations.
Affiliation(s)
- Robert Fragoza
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Jishnu Das
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA, 02139, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Shayne D Wierbowski
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Jin Liang
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Tina N Tran
- Department of Biomedical Science, Cornell University, Ithaca, NY, 14853, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, 14853, USA
| | - Siqi Liang
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Juan F Beltran
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Christen A Rivera-Erick
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Kaixiong Ye
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Ting-Yi Wang
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Li Yao
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Matthew Mort
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Peter D Stenson
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - David N Cooper
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Xiaomu Wei
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Alon Keinan
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
| | - John C Schimenti
- Department of Biomedical Science, Cornell University, Ithaca, NY, 14853, USA
| | - Andrew G Clark
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, 14853, USA
| | - Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA.
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA.
| |
|
73
|
Lou S, Cotter KA, Li T, Liang J, Mohsen H, Liu J, Zhang J, Cohen S, Xu J, Yu H, Rubin MA, Gerstein M. GRAM: A GeneRAlized Model to predict the molecular effect of a non-coding variant in a cell-type specific manner. PLoS Genet 2019; 15:e1007860. [PMID: 31469829 PMCID: PMC6742416 DOI: 10.1371/journal.pgen.1007860] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/26/2018] [Revised: 09/12/2019] [Accepted: 07/22/2019] [Indexed: 12/19/2022] Open
Abstract
There has been much effort to prioritize genomic variants with respect to their impact on "function". However, function is often not precisely defined: sometimes it is the disease association of a variant; on other occasions, it reflects a molecular effect on transcription or epigenetics. Here, we coupled multiple genomic predictors to build GRAM, a GeneRAlized Model, to predict a well-defined experimental target: the expression-modulating effect of a non-coding variant on its associated gene, in a transferable, cell-specific manner. First, we performed feature engineering: using LASSO, a regularized linear model, we found transcription factor (TF) binding most predictive, especially for TFs that are hubs in the regulatory network; in contrast, evolutionary conservation, a popular feature in many other variant-impact predictors, has almost no contribution. Moreover, TF binding inferred from in vitro SELEX is as effective as that from in vivo ChIP-Seq. Second, we implemented GRAM integrating only SELEX features and expression profiles; thus, the program combines a universal regulatory score with an easily obtainable modifier reflecting the particular cell type. We benchmarked GRAM on large-scale MPRA datasets, achieving AUROC scores of 0.72 in GM12878 and 0.66 in a multi-cell line dataset. We then evaluated the performance of GRAM on targeted regions using luciferase assays in the MCF7 and K562 cell lines. We noted that changing the insertion position of the construct relative to the reporter gene gave very different results, highlighting the importance of carefully defining the exact prediction target of the model. Finally, we illustrated the utility of GRAM in fine-mapping causal variants and developed a practical software pipeline to carry this out. In particular, we demonstrated in specific examples how the pipeline could pinpoint variants that directly modulate gene expression within a larger linkage-disequilibrium block associated with a phenotype of interest (e.g., for an eQTL).
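As a rough illustration of what the AUROC figures quoted in the abstract above measure, the sketch below computes AUROC from predicted scores and binary labels using the rank-sum (Mann-Whitney U) identity. It is a generic, minimal implementation and not GRAM's own evaluation code; the function name `auroc` and the toy inputs are assumptions made purely for illustration.

```python
def auroc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    scores: predicted scores, higher means "more likely positive".
    labels: 1 for positives, 0 for negatives (hypothetical encoding).
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank for the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


# Toy example: four variants with made-up scores and labels.
print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives some context for values such as 0.72 and 0.66.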
Affiliation(s)
- Shaoke Lou
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
| | - Kellie A. Cotter
- Department for BioMedical Research, University of Bern, CH, Bern, Switzerland
| | - Tianxiao Li
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
| | - Jin Liang
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, United States of America
| | - Hussein Mohsen
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
- Program in the History of Science and Medicine, Yale University, New Haven, Connecticut, United States of America
| | - Jason Liu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
| | - Jing Zhang
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
| | - Sandra Cohen
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, Cornell University, New York, New York, United States of America
| | - Jinrui Xu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
| | - Haiyuan Yu
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York, United States of America
- Department of Computational Biology, Cornell University, Ithaca, New York, United States of America
| | - Mark A. Rubin
- Department for BioMedical Research, University of Bern, CH, Bern, Switzerland
- Weill Cornell Medicine, New York, United States of America
| | - Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
| |
|
74
|
Wang B, Yan C, Lou S, Emani P, Li B, Xu M, Kong X, Meyerson W, Yang YT, Lee D, Gerstein M. Building a Hybrid Physical-Statistical Classifier for Predicting the Effect of Variants Related to Protein-Drug Interactions. Structure 2019; 27:1469-1481.e3. [PMID: 31279629 DOI: 10.1016/j.str.2019.06.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 07/30/2018] [Revised: 02/14/2019] [Accepted: 06/03/2019] [Indexed: 11/17/2022]
Abstract
A key issue in drug design is how population variation affects drug efficacy by altering binding affinity (BA) in different individuals, an essential consideration for government regulators. Ideally, we would like to evaluate the BA perturbations of millions of single-nucleotide variants (SNVs). However, only hundreds of protein-drug complexes with SNVs have experimentally characterized BAs, constituting too small a gold standard for straightforward statistical model training. Thus, we take a hybrid approach: using physically based calculations to bootstrap the parameterization of a full model. In particular, we do 3D structure-based docking on ∼10,000 SNVs modifying known protein-drug complexes to construct a pseudo gold standard. Then we use this augmented set of BAs to train a statistical model combining structure, ligand and sequence features and illustrate how it can be applied to millions of SNVs. Finally, we show that our model has good cross-validated performance (97% AUROC) and can also be validated by orthogonal ligand-binding data.
Affiliation(s)
- Bo Wang
- Department of Chemistry, Yale University, New Haven, CT 06520, USA
| | - Chengfei Yan
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Shaoke Lou
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Prashant Emani
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Bian Li
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Min Xu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Xiangmeng Kong
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - William Meyerson
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Yale School of Medicine, Yale University, New Haven, CT 06520, USA
| | - Yucheng T Yang
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Donghoon Lee
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
| | - Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA; Department of Computer Science, Yale University, New Haven, CT 06520, USA.
| |
|
75
|
Deveau P, Colmet Daage L, Oldridge D, Bernard V, Bellini A, Chicard M, Clement N, Lapouble E, Combaret V, Boland A, Meyer V, Deleuze JF, Janoueix-Lerosey I, Barillot E, Delattre O, Maris JM, Schleiermacher G, Boeva V. QuantumClone: clonal assessment of functional mutations in cancer based on a genotype-aware method for clonal reconstruction. Bioinformatics 2019; 34:1808-1816. [PMID: 29342233 PMCID: PMC5972665 DOI: 10.1093/bioinformatics/bty016] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 06/27/2017] [Accepted: 01/10/2018] [Indexed: 01/13/2023] Open
Abstract
Motivation In cancer, clonal evolution is assessed based on information coming from single nucleotide variants and copy number alterations. Nonetheless, existing methods often fail to accurately combine information from both sources to truthfully reconstruct clonal populations in a given tumor sample or in a set of tumor samples coming from the same patient. Moreover, previously published methods detect clones from a single set of variants. As a result, compromises have to be made between stringent variant filtering [reducing dispersion in variant allele frequency estimates (VAFs)] and using all biologically relevant variants. Results We present a framework for defining cancer clones using the most reliable variants of high depth of coverage and assigning functional mutations to the detected clones. The key element of our framework is QuantumClone, a method for variant clustering into clones based on VAFs, genotypes of corresponding regions and information about tumor purity. We validated QuantumClone and our framework on simulated data. We then applied our framework to whole genome sequencing data for 19 neuroblastoma trios each including constitutional, diagnosis and relapse samples. We confirmed an enrichment of damaging variants within such pathways as MAPK (mitogen-activated protein kinases), neuritogenesis, epithelial-mesenchymal transition, cell survival and DNA repair. Most pathways had more damaging variants in the expanding clones compared to shrinking ones, which can be explained by the increased total number of variants between these two populations. Functional mutational rate varied for ancestral clones and clones shrinking or expanding upon treatment, suggesting changes in clone selection mechanisms at different time points of tumor evolution. Availability and implementation Source code and binaries of the QuantumClone R package are freely available for download at https://CRAN.R-project.org/package=QuantumClone.
Contact gudrun.schleiermacher@curie.fr or valentina.boeva@inserm.fr. Supplementary information Supplementary data are available at Bioinformatics online.
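To make concrete why VAFs, local genotype and tumor purity have to be combined before variants can be clustered into clones, the sketch below uses a commonly used relation between the expected VAF and the fraction of cancer cells carrying a variant. This is an assumption made for illustration and may differ from the exact model in the QuantumClone package; all function and parameter names are hypothetical and are not the package's API.

```python
def expected_vaf(cellularity, purity, mutated_copies,
                 tumor_copy_number, normal_copy_number=2):
    """Expected variant allele frequency under a simple mixture model.

    cellularity: fraction of cancer cells carrying the variant (0..1).
    purity: fraction of cancer cells in the sample (0..1).
    mutated_copies: number of chromosome copies bearing the variant.
    tumor_copy_number / normal_copy_number: total copies at the locus.
    """
    mutant = purity * cellularity * mutated_copies
    total = purity * tumor_copy_number + (1.0 - purity) * normal_copy_number
    return mutant / total


def cellularity_from_vaf(vaf, purity, mutated_copies,
                         tumor_copy_number, normal_copy_number=2):
    """Invert the relation above to estimate cellularity from an observed VAF."""
    total = purity * tumor_copy_number + (1.0 - purity) * normal_copy_number
    return vaf * total / (purity * mutated_copies)


# A clonal heterozygous variant in a diploid region of an 80%-pure sample.
print(expected_vaf(cellularity=1.0, purity=0.8,
                   mutated_copies=1, tumor_copy_number=2))
```

Under this model the same observed VAF can correspond to different cellularities depending on copy number and purity, which is why clustering on raw VAFs alone can mislead.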
Affiliation(s)
- Paul Deveau
- Institut Curie, PSL Research University, Mines Paris Tech, INSERM U900, Paris, France
- Département de Recherche Translationnelle, Institut Curie, PSL Research University, INSERM U830, Laboratoire RTOP (Recherche Translationelle en Oncologie Pédiatrique), SIREDO Oncology Center (Care, Innovation and research for children and AYA with cancer), Paris, France
- University of Paris-Sud, Orsay, France
| | - Leo Colmet Daage
- Département de Recherche Translationnelle, Institut Curie, PSL Research University, INSERM U830, Laboratoire RTOP (Recherche Translationelle en Oncologie Pédiatrique), SIREDO Oncology Center (Care, Innovation and research for children and AYA with cancer), Paris, France
| | - Derek Oldridge
- Division of Oncology, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Center for Childhood Cancer Research Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
| | - Virginie Bernard
- Institut Curie, PSL Research University, NGS platform ICGex, Paris, France
| | - Angela Bellini
- Département de Recherche Translationnelle, Institut Curie, PSL Research University, INSERM U830, Laboratoire RTOP (Recherche Translationelle en Oncologie Pédiatrique), SIREDO Oncology Center (Care, Innovation and research for children and AYA with cancer), Paris, France
| | - Mathieu Chicard
- Département de Recherche Translationnelle, Institut Curie, PSL Research University, INSERM U830, Laboratoire RTOP (Recherche Translationelle en Oncologie Pédiatrique), SIREDO Oncology Center (Care, Innovation and research for children and AYA with cancer), Paris, France
| | - Nathalie Clement
- Département de Recherche Translationnelle, Institut Curie, PSL Research University, INSERM U830, Laboratoire RTOP (Recherche Translationelle en Oncologie Pédiatrique), SIREDO Oncology Center (Care, Innovation and research for children and AYA with cancer), Paris, France
| | - Eve Lapouble
- Unité de Génétique Somatique, Institut Curie, PSL Research University, Paris, France
| | - Valerie Combaret
- Centre Léon-Bérard Laboratoire de Recherche Translationnelle, Lyon, France
| | - Anne Boland
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut de biologie François Jacob, CEA, Evry, France
| | - Vincent Meyer
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut de biologie François Jacob, CEA, Evry, France
| | - Jean-Francois Deleuze
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut de biologie François Jacob, CEA, Evry, France
| | - Isabelle Janoueix-Lerosey
- Institut Curie, PSL Research University, INSERM U830, SIREDO Oncology Center (Care, Innovation and research for children and AYA with cancer), Equipe labellisée Ligue Nationale contre le cancer, Paris, France
| | - Emmanuel Barillot
- Institut Curie, PSL Research University, Mines Paris Tech, INSERM U900, Paris, France
| | - Olivier Delattre
- Institut Curie, PSL Research University, INSERM U830, SIREDO Oncology Center (Care, Innovation and research for children and AYA with cancer), Equipe labellisée Ligue Nationale contre le cancer, Paris, France
| | - John M Maris
- Division of Oncology, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Center for Childhood Cancer Research Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
| | - Gudrun Schleiermacher
- Département de Recherche Translationnelle, Institut Curie, PSL Research University, INSERM U830, Laboratoire RTOP (Recherche Translationelle en Oncologie Pédiatrique), SIREDO Oncology Center (Care, Innovation and research for children and AYA with cancer), Paris, France
- Département de Pédiatrie, Institut Curie, PSL Research University, Paris, France
- To whom correspondence should be addressed.
| | - Valentina Boeva
- Institut Curie, PSL Research University, Mines Paris Tech, INSERM U900, Paris, France
- Institut Cochin, INSERM U1016, CNRS UMR 8104, Université Paris Descartes UMR-S1016, Paris, France
- To whom correspondence should be addressed.
| |
|
76
|
Enard D, Petrov DA. Evidence that RNA Viruses Drove Adaptive Introgression between Neanderthals and Modern Humans. Cell 2019; 175:360-371.e13. [PMID: 30290142 DOI: 10.1016/j.cell.2018.08.034] [Citation(s) in RCA: 119] [Impact Index Per Article: 19.8] [Received: 04/02/2018] [Revised: 07/04/2018] [Accepted: 08/16/2018] [Indexed: 01/01/2023]
Abstract
Neanderthals and modern humans interbred at least twice in the past 100,000 years. While there is evidence that most introgressed DNA segments from Neanderthals to modern humans were removed by purifying selection, less is known about the adaptive nature of introgressed sequences that were retained. We hypothesized that interbreeding between Neanderthals and modern humans led to (1) the exposure of each species to novel viruses and (2) the exchange of adaptive alleles that provided resistance against these viruses. Here, we find that long, frequent (and more likely adaptive) segments of Neanderthal ancestry in modern humans are enriched for proteins that interact with viruses (VIPs). We found that VIPs that interact specifically with RNA viruses were more likely to belong to introgressed segments in modern Europeans. Our results show that retained segments of Neanderthal ancestry can be used to detect ancient epidemics.
Affiliation(s)
- David Enard
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA.
| | - Dmitri A Petrov
- Department of Biology, Stanford University, Stanford, CA, USA
| |
|
77
|
Zhang Y, Liao G, Bai J, Zhang X, Xu L, Deng C, Yan M, Xie A, Luo T, Long Z, Xiao Y, Li X. Identifying Cancer Driver lncRNAs Bridged by Functional Effectors through Integrating Multi-omics Data in Human Cancers. Mol Ther Nucleic Acids 2019; 17:362-373. [PMID: 31302496 PMCID: PMC6626872 DOI: 10.1016/j.omtn.2019.05.030] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 02/15/2019] [Revised: 04/23/2019] [Accepted: 05/15/2019] [Indexed: 01/18/2023]
Abstract
The accumulation of somatic driver mutations in the human genome enables cells to gradually acquire a growth advantage and contributes to tumor development. Great efforts on protein-coding cancer drivers have yielded fruitful discoveries and clinical applications. However, investigations of cancer drivers in non-coding regions, especially long non-coding RNAs (lncRNAs), are extremely scarce because their functions are poorly understood. Thus, to identify driver lncRNAs by integrating multi-omics data in human cancers, we proposed a computational framework, DriverLncNet, which dissected the functional impact of somatic copy number alteration (CNA) of lncRNAs on regulatory networks and captured key functional effectors in dys-regulatory networks. Applying it to 5 cancer types from The Cancer Genome Atlas (TCGA), we portrayed the landscape of 117 driver lncRNAs and revealed their associated cancer hallmarks through their functional effectors. Moreover, lncRNA RP11-571M6.8 was found to be highly associated with immunotherapeutic targets (PD-1, PD-L1, and CTLA-4) and with regulatory T cell infiltration level and markers (IL2RA and FCGR2B) in glioblastoma multiforme, highlighting its immunosuppressive function. Meanwhile, high expression of RP11-1020A11.1 in bladder carcinoma was predictive of poor survival independent of clinical characteristics, and CTD-2256P15.2 in lung adenocarcinoma was associated with sensitivity to mitogen-activated protein kinase kinase (MEK) inhibitors. In summary, this study provided a framework to decipher the mechanisms of tumorigenesis at the driver lncRNA level, established a new landscape of driver lncRNAs in human cancers, and offered potential clinical implications for precision oncology.
Affiliation(s)
- Yong Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, China
| | - Gaoming Liao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, China
| | - Jing Bai
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, China
| | - Xinxin Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, China
| | - Liwen Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, China
| | - Chunyu Deng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, China
| | - Min Yan
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, China
| | - Aimin Xie
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, China
| | - Tao Luo
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, China
| | - Zhilin Long
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, China
| | - Yun Xiao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, China; Key Laboratory of Cardiovascular Medicine Research, Harbin Medical University, Ministry of Education, Harbin, Heilongjiang 150086, China.
| | - Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, China; Key Laboratory of Cardiovascular Medicine Research, Harbin Medical University, Ministry of Education, Harbin, Heilongjiang 150086, China.
| |
|
78
|
Liu EM, Martinez-Fundichely A, Diaz BJ, Aronson B, Cuykendall T, MacKay M, Dhingra P, Wong EWP, Chi P, Apostolou E, Sanjana NE, Khurana E. Identification of Cancer Drivers at CTCF Insulators in 1,962 Whole Genomes. Cell Syst 2019; 8:446-455.e8. [PMID: 31078526 PMCID: PMC6917527 DOI: 10.1016/j.cels.2019.04.001] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Received: 11/22/2017] [Revised: 11/20/2018] [Accepted: 04/02/2019] [Indexed: 12/15/2022]
Abstract
Recent studies have shown that mutations at non-coding elements, such as promoters and enhancers, can act as cancer drivers. However, an important class of non-coding elements, namely CTCF insulators, has been overlooked in the previous driver analyses. We used insulator annotations from CTCF and cohesin ChIA-PET and analyzed somatic mutations in 1,962 whole genomes from 21 cancer types. Using the heterogeneous patterns of transcription-factor-motif disruption, functional impact, and recurrence of mutations, we developed a computational method that revealed 21 insulators showing signals of positive selection. In particular, mutations in an insulator in multiple cancer types, including 16% of melanoma samples, are associated with TGFB1 up-regulation. Using CRISPR-Cas9, we find that alterations at two of the most frequently mutated regions in this insulator increase cell growth by 40%-50%, supporting the role of this boundary element as a cancer driver. Thus, our study reveals several CTCF insulators as putative cancer drivers.
Affiliation(s)
- Eric Minwei Liu
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10065, USA; Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Alexander Martinez-Fundichely
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10065, USA; Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Bianca Jay Diaz
- New York Genome Center, New York, NY 10013, USA; Department of Biology, New York University, New York, NY 10003, USA
| | - Boaz Aronson
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10065, USA; Department of Medicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Tawny Cuykendall
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10065, USA; Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Matthew MacKay
- Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Priyanka Dhingra
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10065, USA; Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Elissa W P Wong
- Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Ping Chi
- Department of Medicine, Weill Cornell Medicine, New York, NY 10021, USA; Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Effie Apostolou
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10065, USA; Department of Medicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Neville E Sanjana
- New York Genome Center, New York, NY 10013, USA; Department of Biology, New York University, New York, NY 10003, USA
| | - Ekta Khurana
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10065, USA; Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA; Caryl and Israel Englander Institute for Precision Medicine, New York Presbyterian Hospital, Weill Cornell Medicine, New York, NY 10065, USA.
| |
|
79
|
Song J, Peng W, Wang F. A random walk-based method to identify driver genes by integrating the subcellular localization and variation frequency into bipartite graph. BMC Bioinformatics 2019; 20:238. [PMID: 31088372 PMCID: PMC6518800 DOI: 10.1186/s12859-019-2847-9] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 04/14/2019] [Accepted: 04/24/2019] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Cancer is a worldwide problem driven by genomic alterations. With the advent of high-throughput sequencing technology, huge amounts of genomic data are generated every second, offering valuable information about cancer while posing a major challenge to investigators. Cancer is characterized by heterogeneity, and most alterations are thought to be passenger mutations that make no contribution to cancer progression. Hence, identifying driver genes that confer a selective growth advantage on tumor cells from such large and noisy data remains an urgent task. RESULTS Because previous network-based methods ignore some important biological properties of driver genes and gene interaction networks have low reliability, we proposed a random walk method named Subdyquency that integrates information on subcellular localization, variation frequency and interaction with other dysregulated genes to improve the prediction accuracy of driver genes. We applied our model to three different cancers: lung, prostate and breast cancer. The results show that our model can not only identify well-known important driver genes but also prioritize rare unknown driver genes. Moreover, compared with other existing methods, our method improves precision, recall and F-score for most cancer types. CONCLUSIONS The final results imply that driver genes tend to have higher variation frequency and to affect more dysregulated genes in the common significant compartment. AVAILABILITY The source code can be obtained at https://github.com/weiba/Subdyquency.
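The general idea of ranking genes by a random walk can be sketched as follows. This is a generic random walk with restart on a small undirected graph, with the restart distribution weighted by a per-gene variation frequency; it is not the Subdyquency algorithm itself, which works on a bipartite graph and also uses subcellular localization. The graph, the frequencies, the restart probability `alpha` and all names are hypothetical.

```python
def random_walk_with_restart(adjacency, restart, alpha=0.3,
                             tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a random walk with restart.

    adjacency: dict mapping each node to a list of neighbour nodes;
               every neighbour must itself be a key of the dict.
    restart:   dict of non-negative restart weights (e.g. variation frequency).
    alpha:     probability of jumping back to the restart distribution.
    """
    nodes = list(adjacency)
    total = float(sum(restart.get(n, 0.0) for n in nodes))
    if total <= 0:
        raise ValueError("restart weights must sum to a positive value")
    p0 = {n: restart.get(n, 0.0) / total for n in nodes}

    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: alpha * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Dangling node: send its mass back to the restart distribution.
                for n in nodes:
                    nxt[n] += (1.0 - alpha) * p[u] * p0[n]
                continue
            share = (1.0 - alpha) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy network: gene A interacts with B and C; all genes equally frequent.
graph = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
scores = random_walk_with_restart(graph, {"A": 1.0, "B": 1.0, "C": 1.0})
print(sorted(scores, key=scores.get, reverse=True))
```

In this toy network the hub gene A receives the highest score, since probability mass from both neighbours flows into it at every step.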
Affiliation(s)
- Junrong Song
- Faculty of Management and Economics/Computer center/Faculty of Information Engineering and Automation/Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Lianhua Road, 650050, Kunming, People's Republic of China
| | - Wei Peng
- Faculty of Management and Economics/Computer center/Faculty of Information Engineering and Automation/Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Lianhua Road, 650050, Kunming, People's Republic of China.
| | - Feng Wang
- Faculty of Management and Economics/Computer center/Faculty of Information Engineering and Automation/Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Lianhua Road, 650050, Kunming, People's Republic of China
| |
|
80
|
Vergara-Lope A, Ennis S, Vorechovsky I, Pengelly RJ, Collins A. Heterogeneity in the extent of linkage disequilibrium among exonic, intronic, non-coding RNA and intergenic chromosome regions. Eur J Hum Genet 2019; 27:1436-1444. [PMID: 31053778 DOI: 10.1038/s41431-019-0419-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/13/2018] [Revised: 03/04/2019] [Accepted: 04/16/2019] [Indexed: 11/09/2022] Open
Abstract
Whole-genome sequence data enable construction of high-resolution linkage disequilibrium (LD) maps revealing the LD structure of functional elements within genic and subgenic sequences. The Malecot-Morton model defines LD map distances in linkage disequilibrium units (LDUs), analogous to the centimorgan scale of linkage maps. For whole-genome sequence-derived LD maps, we introduce the ratio of corresponding map lengths kilobases/LDU to describe the extent of LD within genome components. The extent of LD is highly variable across the genome ranging from ~38 kb for intergenic sequences to ~858 kb for centromeric regions. LD is ~16% more extensive in genic, compared with intergenic sequences, reflecting relatively increased selection and/or reduced recombination in genes. The LD profile across 18,268 autosomal genes reveals reduced extent of LD, consistent with elevated recombination, in exonic regions near the 5' end of genes but more extensive LD, compared with intronic sequences, across more centrally located exons. Genes classified as essential and genes linked to Mendelian phenotypes show more extensive LD compared with genes associated with complex traits, perhaps reflecting differences in selective pressure. Significant differences between exonic, intronic and intergenic components demonstrate that fine-scale LD structure provides important insights into genome function, which cannot be revealed by LD analysis of much lower resolution array-based genotyping and conventional linkage maps.
Affiliation(s)
- Alejandra Vergara-Lope
- Human Genetics, Faculty of Medicine, University of Southampton, Duthie Building (808), Southampton General Hospital, Tremona Road, Southampton, SO16 6YD, UK
| | - Sarah Ennis
- Human Genetics, Faculty of Medicine, University of Southampton, Duthie Building (808), Southampton General Hospital, Tremona Road, Southampton, SO16 6YD, UK
| | - Igor Vorechovsky
- Human Genetics, Faculty of Medicine, University of Southampton, Duthie Building (808), Southampton General Hospital, Tremona Road, Southampton, SO16 6YD, UK
| | - Reuben J Pengelly
- Human Genetics, Faculty of Medicine, University of Southampton, Duthie Building (808), Southampton General Hospital, Tremona Road, Southampton, SO16 6YD, UK
| | - Andrew Collins
- Human Genetics, Faculty of Medicine, University of Southampton, Duthie Building (808), Southampton General Hospital, Tremona Road, Southampton, SO16 6YD, UK.
| |
|
81
|
Zhao Y, Schaafsma E, Cheng C. Applications of ENCODE data to Systematic Analyses via Data Integration. Curr Opin Syst Biol 2019; 11:57-64. [PMID: 31011690 DOI: 10.1016/j.coisb.2018.08.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/26/2022]
Abstract
Large-scale genomic data have been utilized to generate unprecedented biological findings and new hypotheses. To delineate functional elements in the human genome, the Encyclopedia of DNA Elements (ENCODE) project has generated an enormous amount of genomic data, yielding around 7,000 data profiles in different cell and tissue types. In this article, we reviewed the systematic analyses that have integrated ENCODE data with other data sources to reveal new biological insights, ranging from human genome annotation to the identification of new candidate drugs. These analyses demonstrate the critical impact of ENCODE data on basic biology and translational research.
Affiliation(s)
- Yanding Zhao
- Department of Biomedical Data Science, The Geisel School of Medicine at Dartmouth College, One Medical Center Dr., Dartmouth-Hitchcock Medical Center, Lebanon, NH, United States, 03756.,Department of Molecular and Systems Biology, The Geisel School of Medicine at Dartmouth College, One Medical Center Dr., Dartmouth-Hitchcock Medical Center, Lebanon, NH, United States, 03756
| | - Evelien Schaafsma
- Department of Biomedical Data Science, The Geisel School of Medicine at Dartmouth College, One Medical Center Dr., Dartmouth-Hitchcock Medical Center, Lebanon, NH, United States, 03756.,Department of Molecular and Systems Biology, The Geisel School of Medicine at Dartmouth College, One Medical Center Dr., Dartmouth-Hitchcock Medical Center, Lebanon, NH, United States, 03756
| | - Chao Cheng
- Department of Biomedical Data Science, The Geisel School of Medicine at Dartmouth College, One Medical Center Dr., Dartmouth-Hitchcock Medical Center, Lebanon, NH, United States, 03756.,Department of Molecular and Systems Biology, The Geisel School of Medicine at Dartmouth College, One Medical Center Dr., Dartmouth-Hitchcock Medical Center, Lebanon, NH, United States, 03756.,Norris Cotton Cancer Center, The Geisel School of Medicine at Dartmouth College, One Medical Center Dr., Dartmouth-Hitchcock Medical Center, Lebanon, NH, United States, 03756
| |
|
82
|
Gugnoni M, Ciarrocchi A. Long Noncoding RNA and Epithelial Mesenchymal Transition in Cancer. Int J Mol Sci 2019; 20:ijms20081924. [PMID: 31003545 PMCID: PMC6515529 DOI: 10.3390/ijms20081924] [Citation(s) in RCA: 113] [Impact Index Per Article: 18.8] [Received: 03/28/2019] [Revised: 04/12/2019] [Accepted: 04/15/2019] [Indexed: 12/22/2022] Open
Abstract
Epithelial-mesenchymal transition (EMT) is a multistep process that allows epithelial cells to acquire mesenchymal properties. Fundamental in the early stages of embryonic development, this process is aberrantly activated in aggressive cancerous cells to gain motility and invasion capacity, thus promoting metastatic phenotypes. For this reason, EMT is a central topic in cancer research and its regulation by a plethora of mechanisms has been reported. Recently, genomic sequencing and functional genomic studies have deepened our knowledge of the fundamental regulatory role of noncoding DNA. A large part of the genome is transcribed into an impressive number of noncoding RNAs. Among these, long noncoding RNAs (lncRNAs) have been reported to control several biological processes affecting gene expression at multiple levels from transcription to protein localization and stability. To date, more than 8000 lncRNAs have been found to be selectively expressed in cancer cells. Their large number and high expression specificity make these molecules candidates as a valuable source of biomarkers and potential therapeutic targets. Growing evidence highlights a relevant function of lncRNAs in EMT regulation, defining a new layer of involvement of these molecules in cancer biology. In this review we aim to summarize the findings on the role of lncRNAs in EMT regulation and to discuss their potential value as biomarkers and therapeutic targets in cancer.
Affiliation(s)
- Mila Gugnoni
- Laboratory of Translational Research, Azienda Unità Sanitaria Locale-IRCCS di Reggio Emilia, 42122 Reggio Emilia, Italy.
| | - Alessia Ciarrocchi
- Laboratory of Translational Research, Azienda Unità Sanitaria Locale-IRCCS di Reggio Emilia, 42122 Reggio Emilia, Italy.
| |
|
83
|
Ma Y, Wei P. FunSPU: A versatile and adaptive multiple functional annotation-based association test of whole-genome sequencing data. PLoS Genet 2019; 15:e1008081. [PMID: 31034468 PMCID: PMC6508749 DOI: 10.1371/journal.pgen.1008081] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 07/06/2018] [Revised: 05/09/2019] [Accepted: 03/11/2019] [Indexed: 11/19/2022] Open
Abstract
Despite ongoing large-scale population-based whole-genome sequencing (WGS) projects such as the NIH NHLBI TOPMed program and the NHGRI Genome Sequencing Program, WGS-based association analysis of complex traits remains a tremendous challenge due to the large number of rare variants, many of which are non-trait-associated neutral variants. External biological knowledge, such as functional annotations based on the ENCODE, Epigenomics Roadmap and GTEx projects, may be helpful in distinguishing causal rare variants from neutral ones; however, each functional annotation can only provide certain aspects of the biological functions. Our knowledge for selecting informative annotations a priori is limited, and incorporating non-informative annotations will introduce noise and lose power. We propose FunSPU, a versatile and adaptive test that incorporates multiple biological annotations and is adaptive at both the annotation and variant levels and thus maintains high power even in the presence of noninformative annotations. In addition to extensive simulations, we illustrate our proposed test using the TWINSUK cohort (n = 1,752) of UK10K WGS data based on six functional annotations: CADD, RegulomeDB, FunSeq, Funseq2, GERP++, and GenoSkyline. We identified genome-wide significant genetic loci on chromosome 19 near gene TOMM40 and APOC4-APOC2 associated with low-density lipoprotein (LDL), which are replicated in the UK10K ALSPAC cohort (n = 1,497). These replicated LDL-associated loci were missed by existing rare variant association tests that either ignore external biological information or rely on a single source of biological knowledge. We have implemented the proposed test in an R package "FunSPU".
Affiliation(s)
- Yiding Ma
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
- Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center, Houston, Texas, United States of America
| | - Peng Wei
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
| |
|
84
|
Ho EYK, Cao Q, Gu M, Chan RWL, Wu Q, Gerstein M, Yip KY. Shaping the nebulous enhancer in the era of high-throughput assays and genome editing. Brief Bioinform 2019; 21:836-850. [PMID: 30895290 DOI: 10.1093/bib/bbz030] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 11/12/2018] [Revised: 02/15/2019] [Accepted: 02/26/2019] [Indexed: 01/22/2023] Open
Abstract
Since the first discovery of transcriptional enhancers in 1981, their textbook definition has remained largely unchanged in the past 37 years. With the emergence of high-throughput assays and genome editing, which are switching the paradigm from bottom-up discovery and testing of individual enhancers to top-down profiling of enhancer activities genome-wide, it has become increasingly evident that this classical definition has left substantial gray areas in different aspects. Here we survey a representative set of recent research articles and report the definitions of enhancers they have adopted. The results reveal that a wide spectrum of definitions is used, usually without the definition stated explicitly, which could lead to difficulties in data interpretation and downstream analyses. Based on these findings, we discuss the practical implications and suggestions for future studies.
Affiliation(s)
| | - Qin Cao
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong
| | - Mengting Gu
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, USA
| | - Ricky Wai-Lun Chan
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong
| | - Qiong Wu
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong.,School of Biomedical Sciences, The Chinese University of Hong Kong, Hong Kong
| | - Mark Gerstein
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, USA.,Program in Computational Biology and Bioinformatics.,Department of Computer Science, Yale University, New Haven, Connecticut, USA
| | - Kevin Y Yip
- Department of Biomedical Engineering.,Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong.,Hong Kong Bioinformatics Centre.,CUHK-BGI Innovation Institute of Trans-omics.,Hong Kong Institute of Diabetes and Obesity, The Chinese University of Hong Kong, Hong Kong
| |
|
85
|
Positive selection in Europeans and East-Asians at the ABCA12 gene. Sci Rep 2019; 9:4843. [PMID: 30890716 PMCID: PMC6424970 DOI: 10.1038/s41598-019-40360-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2018] [Accepted: 02/07/2019] [Indexed: 11/17/2022] Open
Abstract
Natural selection acts on genetic variants by increasing the frequency of alleles responsible for a cellular function that is favorable in a certain environment. In a previous genome-wide scan for positive selection in contemporary humans, we identified a signal of positive selection in Europeans and Asians at the genetic variant rs10180970. The variant is located in the second intron of the ABCA12 gene, which is implicated in lipid barrier formation and is down-regulated by UVB radiation. We studied the signal of selection in the genomic region surrounding rs10180970 in a larger dataset that includes DNA sequences from ancient samples. We also investigated the functional consequences for gene expression of the alleles of rs10180970 and of another genetic variant in its proximity in healthy volunteers exposed to similar UV radiation. We confirmed the selection signal and refined its location, which extends over 35 kb and includes the first intron, the first two exons and the transcription start site of ABCA12. We found no obvious effect of rs10180970 alleles on ABCA12 gene expression. We reconstructed the trajectory of the T allele over the last 80,000 years and found that it was specific to H. sapiens and present in non-Africans 45,000 years ago.
|
86
|
Rigau M, Juan D, Valencia A, Rico D. Intronic CNVs and gene expression variation in human populations. PLoS Genet 2019; 15:e1007902. [PMID: 30677042 PMCID: PMC6345438 DOI: 10.1371/journal.pgen.1007902] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 05/15/2018] [Accepted: 12/17/2018] [Indexed: 11/19/2022] Open
Abstract
Introns can be extraordinarily large and they account for the majority of the DNA sequence in human genes. However, little is known about their population patterns of structural variation and their functional implication. By combining the most extensive maps of CNVs in human populations, we have found that intronic losses are the most frequent copy number variants (CNVs) in protein-coding genes in humans, with 12,986 intronic deletions, affecting 4,147 genes (including 1,154 essential genes and 1,638 disease-related genes). This intronic length variation results in dozens of genes showing extreme population variability in size, with 40 genes with 10 or more different sizes and up to 150 allelic sizes. Intronic losses are frequent in evolutionarily ancient genes that are highly conserved at the protein sequence level. This result contrasts with losses overlapping exons, which are observed less often than expected by chance and almost exclusively affect primate-specific genes. An integrated analysis of CNVs and RNA-seq data showed that intronic loss can be associated with significant differences in gene expression levels in the population (CNV-eQTLs). These intronic CNV-eQTL regions are enriched for intronic enhancers and can be associated with expression differences of other genes showing long-distance intron-promoter 3D interactions. Our data suggest that intronic structural variation of protein-coding genes makes an important contribution to the variability of gene expression and splicing in human populations.
Affiliation(s)
- Maria Rigau
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
| | - David Juan
- Institut de Biologia Evolutiva, Consejo Superior de Investigaciones Científicas–Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain
| | - Alfonso Valencia
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
| | - Daniel Rico
- Institute of Cellular Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
| |
|
87
|
Park JH, Woo YM, Youm EM, Hamad N, Won HH, Naka K, Park EJ, Park JH, Kim HJ, Kim SH, Kim HJ, Ahn JS, Sohn SK, Moon JH, Jung CW, Park S, Lipton JH, Kimura S, Kim JW, Kim DDH. HMGCLL1 is a predictive biomarker for deep molecular response to imatinib therapy in chronic myeloid leukemia. Leukemia 2018; 33:1439-1450. [PMID: 30555164 PMCID: PMC6756062 DOI: 10.1038/s41375-018-0321-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 07/30/2018] [Revised: 09/27/2018] [Accepted: 10/16/2018] [Indexed: 12/13/2022]
Abstract
Achieving a deep molecular response (DMR) to tyrosine kinase inhibitor (TKI) therapy for chronic myeloid leukemia (CML) remains challenging and at present, there is no biomarker to predict DMR in this setting. Herein, we report that an HMGCLL1 genetic variant located in 6p12.1 can be used as a predictive genetic biomarker for intrinsic sensitivity to imatinib (IM) therapy. We measured DMR rate according to HMGCLL1 variant in a discovery set of CML patients (n = 201) and successfully replicated it in a validation set (n = 270). We also investigated the functional relevance of HMGCLL1 blockade with respect to response to TKI therapy and showed that small interfering RNA mediated blockade of HMGCLL1 isoform 3 results in significant decrease in viability of BCR-ABL1-positive cells including K562, CML-T1 or BaF3 cell lines with or without ABL1 kinase domain mutations such as T315I mutation. Decreased cell viability was also demonstrated in murine CML stem cells and human hematopoietic progenitor cells. RNA sequencing showed that blockade of HMGCLL1 was associated with G0/G1 arrest and the cell cycle. In summary, the HMGCLL1 gene polymorphism is a novel genetic biomarker for intrinsic sensitivity to IM therapy in CML patients that predicts DMR in this setting.
Affiliation(s)
- Jong-Ho Park
- Department of Health Sciences and Technology, Samsung Advanced Institute for Health Sciences and Technology, Sungkyunkwan University, Seoul, Korea
| | - Young Min Woo
- Department of Health Sciences and Technology, Samsung Advanced Institute for Health Sciences and Technology, Sungkyunkwan University, Seoul, Korea
| | - Emilia Moonkyung Youm
- Department of Health Sciences and Technology, Samsung Advanced Institute for Health Sciences and Technology, Sungkyunkwan University, Seoul, Korea
| | - Nada Hamad
- Department of Haematology, St Vincent's Hospital, University of New South Wales, Sydney, Australia
| | - Hong-Hee Won
- Samsung Advanced Institute for Health Sciences and Technology, Sungkyunkwan University, Samsung Medical Center, Seoul, Korea
| | - Kazuhito Naka
- Department of Stem Cell Biology, Research Institute for Radiation Biology and Medicine, Hiroshima University, Hiroshima, Japan
| | - Eun-Ju Park
- Research Institute for Future Medicine, Samsung Medical Center, Seoul, Korea
| | - June-Hee Park
- Research Institute for Future Medicine, Samsung Medical Center, Seoul, Korea
| | - Hee-Jin Kim
- Department of Laboratory Medicine and Genetics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
| | - Sun-Hee Kim
- Department of Laboratory Medicine and Genetics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
| | - Hyeoung-Joon Kim
- Department of Hematology-Oncology, Chonnam National University Hwasun Hospital, Hwasun, Korea
| | - Jae Sook Ahn
- Department of Hematology-Oncology, Chonnam National University Hwasun Hospital, Hwasun, Korea
| | - Sang Kyun Sohn
- Department of Hematology/Oncology, Kyungpook National University Hospital, Daegu, Korea
| | - Joon Ho Moon
- Department of Hematology/Oncology, Kyungpook National University Hospital, Daegu, Korea
| | - Chul Won Jung
- Department of Hematology/Oncology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
| | - Silvia Park
- Department of Hematology/Oncology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
| | - Jeffrey H Lipton
- Department of Medical Oncology & Hematology, Princess Margaret Cancer Centre, University Health Network, University of Toronto, Toronto, Canada
| | - Shinya Kimura
- Division of Hematology, Respiratory Medicine and Oncology, Department of Internal Medicine, Faculty of Medicine, Saga University, 5-1-1 Nabeshima, Saga, 849-8501, Japan
| | - Jong-Won Kim
- Department of Health Sciences and Technology, Samsung Advanced Institute for Health Sciences and Technology, Sungkyunkwan University, Seoul, Korea.,Samsung Advanced Institute for Health Sciences and Technology, Sungkyunkwan University, Samsung Medical Center, Seoul, Korea.,Department of Laboratory Medicine and Genetics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
| | - Dennis Dong Hwan Kim
- Department of Medical Oncology & Hematology, Princess Margaret Cancer Centre, University Health Network, University of Toronto, Toronto, Canada
| |
|
88
|
Wang X, Zheng Z, Cai Y, Chen T, Li C, Fu W, Jiang Y. CNVcaller: highly efficient and widely applicable software for detecting copy number variations in large populations. Gigascience 2018; 6:1-12. [PMID: 29220491 PMCID: PMC5751039 DOI: 10.1093/gigascience/gix115] [Citation(s) in RCA: 80] [Impact Index Per Article: 11.4] [Received: 05/25/2017] [Accepted: 11/17/2017] [Indexed: 01/01/2023] Open
Abstract
Background The increasing amount of sequencing data available for a wide variety of species can be theoretically used for detecting copy number variations (CNVs) at the population level. However, the growing sample sizes and the divergent complexity of nonhuman genomes challenge the efficiency and robustness of current human-oriented CNV detection methods. Results Here, we present CNVcaller, a read-depth method for discovering CNVs in population sequencing data. The computational speed of CNVcaller was 1-2 orders of magnitude faster than CNVnator and Genome STRiP for complex genomes with thousands of unmapped scaffolds. CNV detection of 232 goats required only 1.4 days on a single compute node. Additionally, the Mendelian consistency of sheep trios indicated that CNVcaller mitigated the influence of high proportions of gaps and misassembled duplications in the nonhuman reference genome assembly. Furthermore, multiple evaluations using real sheep and human data indicated that CNVcaller achieved the best accuracy and sensitivity for detecting duplications. Conclusions The fast generalized detection algorithms included in CNVcaller overcome prior computational barriers for detecting CNVs in large-scale sequencing data with complex genomic structures. Therefore, CNVcaller promotes population genetic analyses of functional CNVs in more species.
Affiliation(s)
- Xihong Wang
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Zhuqing Zheng
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Yudong Cai
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Ting Chen
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Chao Li
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Weiwei Fu
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Yu Jiang
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| |
|
89
|
Deng Y, Luo S, Zhang X, Zou C, Yuan H, Liao G, Xu L, Deng C, Lan Y, Zhao T, Gao X, Xiao Y, Li X. A pan-cancer atlas of cancer hallmark-associated candidate driver lncRNAs. Mol Oncol 2018; 12:1980-2005. [PMID: 30216655 PMCID: PMC6210054 DOI: 10.1002/1878-0261.12381] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 04/21/2018] [Revised: 07/21/2018] [Accepted: 09/03/2018] [Indexed: 12/12/2022] Open
Abstract
Substantial cancer genome sequencing efforts have discovered many important driver genes contributing to tumorigenesis. However, very little is known about the genetic alterations of long non‐coding RNAs (lncRNAs) in cancer. Thus, there is a need for systematic surveys of driver lncRNAs. Through integrative analysis of 5918 tumors across 11 cancer types, we revealed that lncRNAs have undergone dramatic genomic alterations, many of which are mutually exclusive with well‐known cancer genes. Using the hypothesis of functional redundancy of mutual exclusivity, we developed a computational framework to identify driver lncRNAs associated with different cancer hallmarks. Applying it to pan‐cancer data, we identified 378 candidate driver lncRNAs whose genomic features highly resemble the known cancer driver genes (e.g. high conservation and early replication). We further validated the candidate driver lncRNAs involved in ‘Tissue Invasion and Metastasis’ in lung adenocarcinoma and breast cancer, and also highlighted their potential roles in improving clinical outcomes. In summary, we have generated a comprehensive landscape of cancer candidate driver lncRNAs that could act as a starting point for future functional explorations, as well as the identification of biomarkers and lncRNA‐based target therapy.
Affiliation(s)
- Yulan Deng
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Shangyi Luo
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Xinxin Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Chaoxia Zou
- Department of Biochemistry and Molecular Biology, Harbin Medical University, China
| | - Huating Yuan
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Gaoming Liao
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Liwen Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Chunyu Deng
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Yujia Lan
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Tingting Zhao
- Department of Neurology, The First Affiliated Hospital of Harbin Medical University, China
| | - Xu Gao
- Department of Biochemistry and Molecular Biology, Harbin Medical University, China
| | - Yun Xiao
- College of Bioinformatics Science and Technology, Harbin Medical University, China.,Key Laboratory of Cardiovascular Medicine Research, Harbin Medical University, Ministry of Education, China
| | - Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, China.,Key Laboratory of Cardiovascular Medicine Research, Harbin Medical University, Ministry of Education, China
| |
|
90
|
Yang C, Stueve TR, Yan C, Rhie SK, Mullen DJ, Luo J, Zhou B, Borok Z, Marconett CN, Offringa IA. Positional integration of lung adenocarcinoma susceptibility loci with primary human alveolar epithelial cell epigenomes. Epigenomics 2018; 10:1167-1187. [PMID: 30212242 PMCID: PMC6391636 DOI: 10.2217/epi-2018-0003] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 01/02/2018] [Accepted: 05/10/2018] [Indexed: 01/12/2023] Open
Abstract
AIM To identify functional lung adenocarcinoma (LUAD) risk SNPs. MATERIALS & METHODS Eighteen validated LUAD risk SNPs (p ≤ 5 × 10⁻⁸) and 930 SNPs in high linkage disequilibrium (r² > 0.5) were integrated with epigenomic information from primary human alveolar epithelial cells. Enhancer-associated SNPs likely affecting transcription factor-binding sites were predicted. Three SNPs were functionally investigated using luciferase assays, expression quantitative trait loci and cancer-specific expression. RESULTS Forty-seven SNPs mapped to putative enhancers; 11 were located in open chromatin. Of these, seven altered predicted transcription factor-binding motifs. Rs6942067 showed allele-specific luciferase expression, and expression quantitative trait loci analysis indicates that it influences expression of DCBLD1, a gene that encodes an unknown membrane protein and is overexpressed in LUAD. CONCLUSION Integration of candidate LUAD risk SNPs with epigenomic marks from normal alveolar epithelium identified numerous candidate functional LUAD risk SNPs including rs6942067, which appears to affect DCBLD1 expression. Data deposition: Data are provided in GEO record GSE84273.
Affiliation(s)
- Chenchen Yang
- Department of Surgery, University of Southern California, CA 90089, USA
- Department of Biochemistry & Molecular Medicine, University of Southern California, CA 90089, USA
- Norris Comprehensive Cancer Center, University of Southern California, CA 90089, USA
| | - Theresa Ryan Stueve
- Department of Surgery, University of Southern California, CA 90089, USA
- Department of Biochemistry & Molecular Medicine, University of Southern California, CA 90089, USA
- Norris Comprehensive Cancer Center, University of Southern California, CA 90089, USA
- Department of Preventive Medicine, University of Southern California, CA 90089, USA
| | - Chunli Yan
- Department of Surgery, University of Southern California, CA 90089, USA
- Department of Biochemistry & Molecular Medicine, University of Southern California, CA 90089, USA
- Norris Comprehensive Cancer Center, University of Southern California, CA 90089, USA
| | - Suhn K Rhie
- Department of Surgery, University of Southern California, CA 90089, USA
- Department of Biochemistry & Molecular Medicine, University of Southern California, CA 90089, USA
- Norris Comprehensive Cancer Center, University of Southern California, CA 90089, USA
| | - Daniel J Mullen
- Department of Surgery, University of Southern California, CA 90089, USA
- Department of Biochemistry & Molecular Medicine, University of Southern California, CA 90089, USA
- Norris Comprehensive Cancer Center, University of Southern California, CA 90089, USA
| | - Jiao Luo
- Department of Biochemistry & Molecular Medicine, University of Southern California, CA 90089, USA
- Department of Medicine, Division of Pulmonary & Critical Care & Sleep Medicine, University of Southern California, CA 90089, USA
| | - Beiyun Zhou
- Norris Comprehensive Cancer Center, University of Southern California, CA 90089, USA
- Department of Medicine, Division of Pulmonary & Critical Care & Sleep Medicine, University of Southern California, CA 90089, USA
- Hastings Center for Pulmonary Research, Keck School of Medicine, University of Southern California, CA 90089, USA
| | - Zea Borok
- Department of Biochemistry & Molecular Medicine, University of Southern California, CA 90089, USA
- Norris Comprehensive Cancer Center, University of Southern California, CA 90089, USA
- Department of Medicine, Division of Pulmonary & Critical Care & Sleep Medicine, University of Southern California, CA 90089, USA
- Hastings Center for Pulmonary Research, Keck School of Medicine, University of Southern California, CA 90089, USA
| | - Crystal N Marconett
- Department of Surgery, University of Southern California, CA 90089, USA
- Department of Biochemistry & Molecular Medicine, University of Southern California, CA 90089, USA
- Norris Comprehensive Cancer Center, University of Southern California, CA 90089, USA
| | - Ite A Offringa
- Department of Surgery, University of Southern California, CA 90089, USA
- Department of Biochemistry & Molecular Medicine, University of Southern California, CA 90089, USA
- Norris Comprehensive Cancer Center, University of Southern California, CA 90089, USA
| |
|
91
|
Gorski MM, de Haan HG, Mancini I, Lotta LA, Bucciarelli P, Passamonti SM, Cairo A, Pappalardo E, van Hylckama Vlieg A, Martinelli I, Rosendaal FR, Peyvandi F. Next-generation DNA sequencing to identify novel genetic risk factors for cerebral vein thrombosis. Thromb Res 2018; 169:76-81. [DOI: 10.1016/j.thromres.2018.06.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 01/12/2018] [Revised: 05/16/2018] [Accepted: 06/13/2018] [Indexed: 11/26/2022]
|
92
|
McGillivray P, Clarke D, Meyerson W, Zhang J, Lee D, Gu M, Kumar S, Zhou H, Gerstein M. Network Analysis as a Grand Unifier in Biomedical Data Science. Annu Rev Biomed Data Sci 2018. [DOI: 10.1146/annurev-biodatasci-080917-013444] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 11/09/2022]
Abstract
Biomedical data scientists study many types of networks, ranging from those formed by neurons to those created by molecular interactions. People often criticize these networks as uninterpretable diagrams termed hairballs; however, here we show that molecular biological networks can be interpreted in several straightforward ways. First, we can break down a network into smaller components, focusing on individual pathways and modules. Second, we can compute global statistics describing the network as a whole. Third, we can compare networks. These comparisons can be within the same context (e.g., between two gene regulatory networks) or cross-disciplinary (e.g., between regulatory networks and governmental hierarchies). The latter comparisons can transfer a formalism, such as that for Markov chains, from one context to another or relate our intuitions in a familiar setting (e.g., social networks) to the relatively unfamiliar molecular context. Finally, key aspects of molecular networks are dynamics and evolution, i.e., how they evolve over time and how genetic variants affect them. By studying the relationships between variants in networks, we can begin to interpret many common diseases, such as cancer and heart disease.
Affiliation(s)
- Patrick McGillivray
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA
| | - Declan Clarke
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA
| | - William Meyerson
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA
| | - Jing Zhang
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA
| | - Donghoon Lee
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA
| | - Mengting Gu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA
- Department of Computer Science, Yale University, New Haven, Connecticut 06520, USA
| | - Sushant Kumar
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA
| | - Holly Zhou
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA
| | - Mark Gerstein
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA
- Department of Computer Science, Yale University, New Haven, Connecticut 06520, USA
| |
|
93
|
Mao P, Brown AJ, Esaki S, Lockwood S, Poon GMK, Smerdon MJ, Roberts SA, Wyrick JJ. ETS transcription factors induce a unique UV damage signature that drives recurrent mutagenesis in melanoma. Nat Commun 2018; 9:2626. [PMID: 29980679 PMCID: PMC6035183 DOI: 10.1038/s41467-018-05064-0] [Citation(s) in RCA: 97] [Impact Index Per Article: 13.9] [Received: 01/19/2018] [Accepted: 06/07/2018] [Indexed: 11/12/2022] Open
Abstract
Recurrent mutations are frequently associated with transcription factor (TF) binding sites (TFBS) in melanoma, but the mechanism driving mutagenesis at TFBS is unclear. Here, we use a method called CPD-seq to map the distribution of UV-induced cyclobutane pyrimidine dimers (CPDs) across the human genome at single nucleotide resolution. Our results indicate that CPD lesions are elevated at active TFBS, an effect that is primarily due to E26 transformation-specific (ETS) TFs. We show that ETS TFs induce a unique signature of CPD hotspots that are highly correlated with recurrent mutations in melanomas, despite high repair activity at these sites. ETS1 protein renders its DNA binding targets extremely susceptible to UV damage in vitro, due to binding-induced perturbations in the DNA structure that favor CPD formation. These findings define a mechanism responsible for recurrent mutations in melanoma and reveal that DNA binding by ETS TFs is inherently mutagenic in UV-exposed cells. Many factors contribute to mutation hotspots in cancer cells. Here the authors map UV damage at single-nucleotide resolution across the human genome and find that binding sites of ETS transcription factors are especially prone to forming UV lesions, leading to mutation hotspots in melanoma.
Affiliation(s)
- Peng Mao
- School of Molecular Biosciences, Washington State University, Pullman, WA, 99164, USA
| | - Alexander J Brown
- School of Molecular Biosciences, Washington State University, Pullman, WA, 99164, USA
| | - Shingo Esaki
- Department of Chemistry, Georgia State University, Atlanta, GA, 30303, USA
| | - Svetlana Lockwood
- Paul G. Allen School for Global Animal Health, Washington State University, Pullman, WA, 99164, USA
| | - Gregory M K Poon
- Department of Chemistry, Georgia State University, Atlanta, GA, 30303, USA.,Center for Diagnostics and Therapeutics, Georgia State University, Atlanta, GA, 30303, USA
| | - Michael J Smerdon
- School of Molecular Biosciences, Washington State University, Pullman, WA, 99164, USA
| | - Steven A Roberts
- School of Molecular Biosciences, Washington State University, Pullman, WA, 99164, USA.,Center for Reproductive Biology, Washington State University, Pullman, WA, 99164, USA
| | - John J Wyrick
- School of Molecular Biosciences, Washington State University, Pullman, WA, 99164, USA.,Center for Reproductive Biology, Washington State University, Pullman, WA, 99164, USA
| |
|
94
|
Brandler WM, Antaki D, Gujral M, Kleiber ML, Whitney J, Maile MS, Hong O, Chapman TR, Tan S, Tandon P, Pang T, Tang SC, Vaux KK, Yang Y, Harrington E, Juul S, Turner DJ, Thiruvahindrapuram B, Kaur G, Wang Z, Kingsmore SF, Gleeson JG, Bisson D, Kakaradov B, Telenti A, Venter JC, Corominas R, Toma C, Cormand B, Rueda I, Guijarro S, Messer KS, Nievergelt CM, Arranz MJ, Courchesne E, Pierce K, Muotri AR, Iakoucheva LM, Hervas A, Scherer SW, Corsello C, Sebat J. Paternally inherited cis-regulatory structural variants are associated with autism. Science 2018; 360:327-331. [PMID: 29674594 DOI: 10.1126/science.aan2261] [Citation(s) in RCA: 137] [Impact Index Per Article: 19.6] [Received: 03/14/2017] [Revised: 08/07/2017] [Accepted: 02/27/2018] [Indexed: 12/15/2022]
Abstract
The genetic basis of autism spectrum disorder (ASD) is known to consist of contributions from de novo mutations in variant-intolerant genes. We hypothesize that rare inherited structural variants in cis-regulatory elements (CRE-SVs) of these genes also contribute to ASD. We investigated this by assessing the evidence for natural selection and transmission distortion of CRE-SVs in whole genomes of 9274 subjects from 2600 families affected by ASD. In a discovery cohort of 829 families, structural variants were depleted within promoters and untranslated regions, and paternally inherited CRE-SVs were preferentially transmitted to affected offspring and not to their unaffected siblings. The association of paternal CRE-SVs was replicated in an independent sample of 1771 families. Our results suggest that rare inherited noncoding variants predispose children to ASD, with differing contributions from each parent.
Affiliation(s)
- William M Brandler
- Beyster Center for Genomics of Psychiatric Diseases, University of California San Diego, La Jolla, CA 92093, USA.,Department of Psychiatry, University of California San Diego, La Jolla, CA 92093, USA.,Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA 92093, USA.,Human Longevity, Inc., San Diego, CA 92121, USA
| | - Danny Antaki
- Beyster Center for Genomics of Psychiatric Diseases, University of California San Diego, La Jolla, CA 92093, USA.,Department of Psychiatry, University of California San Diego, La Jolla, CA 92093, USA.,Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA 92093, USA.,Biomedical Sciences Graduate Program, University of California San Diego, La Jolla, CA 92093, USA
| | - Madhusudan Gujral
- Beyster Center for Genomics of Psychiatric Diseases, University of California San Diego, La Jolla, CA 92093, USA.,Department of Psychiatry, University of California San Diego, La Jolla, CA 92093, USA.,Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA 92093, USA
| | - Morgan L Kleiber
- Beyster Center for Genomics of Psychiatric Diseases, University of California San Diego, La Jolla, CA 92093, USA.,Department of Psychiatry, University of California San Diego, La Jolla, CA 92093, USA.,Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA 92093, USA
| | - Joe Whitney
- The Centre for Applied Genomics, Genetics, and Genome Biology, The Hospital for Sick Children, Toronto, Canada
| | - Michelle S Maile
- Beyster Center for Genomics of Psychiatric Diseases, University of California San Diego, La Jolla, CA 92093, USA.,Department of Psychiatry, University of California San Diego, La Jolla, CA 92093, USA.,Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA 92093, USA
| | - Oanh Hong
- Beyster Center for Genomics of Psychiatric Diseases, University of California San Diego, La Jolla, CA 92093, USA.,Department of Psychiatry, University of California San Diego, La Jolla, CA 92093, USA.,Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA 92093, USA
| | - Timothy R Chapman
- Beyster Center for Genomics of Psychiatric Diseases, University of California San Diego, La Jolla, CA 92093, USA.,Department of Psychiatry, University of California San Diego, La Jolla, CA 92093, USA.,Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA 92093, USA
| | - Shirley Tan
- Beyster Center for Genomics of Psychiatric Diseases, University of California San Diego, La Jolla, CA 92093, USA.,Department of Psychiatry, University of California San Diego, La Jolla, CA 92093, USA.,Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA 92093, USA
| | - Prateek Tandon
- Beyster Center for Genomics of Psychiatric Diseases, University of California San Diego, La Jolla, CA 92093, USA.,Department of Psychiatry, University of California San Diego, La Jolla, CA 92093, USA.,Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA 92093, USA
| | - Timothy Pang
- Beyster Center for Genomics of Psychiatric Diseases, University of California San Diego, La Jolla, CA 92093, USA.,Rady Children's Hospital, San Diego, CA 92123, USA
| | - Shih C Tang
- Beyster Center for Genomics of Psychiatric Diseases, University of California San Diego, La Jolla, CA 92093, USA.,Rady Children's Hospital, San Diego, CA 92123, USA
| | - Keith K Vaux
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
| | - Yan Yang
- Oxford Nanopore Technologies, Inc., NY 10013, USA
| | | | - Sissel Juul
- Oxford Nanopore Technologies, Inc., NY 10013, USA
| | | | - Bhooma Thiruvahindrapuram
- The Centre for Applied Genomics, Genetics, and Genome Biology, The Hospital for Sick Children, Toronto, Canada
| | - Gaganjot Kaur
- The Centre for Applied Genomics, Genetics, and Genome Biology, The Hospital for Sick Children, Toronto, Canada
| | - Zhuozhi Wang
- The Centre for Applied Genomics, Genetics, and Genome Biology, The Hospital for Sick Children, Toronto, Canada
| | - Stephen F Kingsmore
- Rady Children's Institute for Genomic Medicine, Rady Children's Hospital, San Diego, CA 92123, USA
| | - Joseph G Gleeson
- Howard Hughes Medical Institute, Rady Children's Institute of Genomic Medicine, Department of Neurosciences, University of California San Diego, La Jolla, CA 92093, USA
| | | | | | | | - J Craig Venter
- Human Longevity, Inc., San Diego, CA 92121, USA.,J. Craig Venter Institute, La Jolla, CA 92037, USA
| | - Roser Corominas
- Genetics Unit, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona, Spain.,Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Barcelona, Spain
| | - Claudio Toma
- Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona, Catalonia, Spain.,Neuroscience Research Australia, Sydney, Australia.,School of Medical Sciences, University of New South Wales, Sydney, Australia
| | - Bru Cormand
- Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Barcelona, Spain.,Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona, Catalonia, Spain.,Institut de Biomedicina de la Universitat de Barcelona (IBUB), Catalonia, Spain.,Institut de Recerca Sant Joan de Déu (IR-SJD), Esplugues, Catalonia, Spain
| | - Isabel Rueda
- Department of Psychiatry, Hospital Sant Joan de Déu, Barcelona, Spain
| | - Silvina Guijarro
- Child and Adolescent Mental Health Unit, Hospital Universitari Mútua de Terrassa, Barcelona, Spain
| | - Karen S Messer
- Division of Biostatistics and Bioinformatics, Department of Family Medicine and Public Health, University of California, San Diego, La Jolla, CA 92093, USA
| | - Caroline M Nievergelt
- Department of Psychiatry, University of California San Diego, La Jolla, CA 92093, USA
| | - Maria J Arranz
- Research Laboratory Unit, Fundacio Docencia I Recerca Mutua Terrassa, Barcelona, Spain
| | - Eric Courchesne
- Autism Center of Excellence, Department of Neuroscience, University of California San Diego, La Jolla, CA 92093, USA
| | - Karen Pierce
- Autism Center of Excellence, Department of Neuroscience, University of California San Diego, La Jolla, CA 92093, USA
| | - Alysson R Muotri
- Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA 92093, USA
| | - Lilia M Iakoucheva
- Department of Psychiatry, University of California San Diego, La Jolla, CA 92093, USA
| | - Amaia Hervas
- Child and Adolescent Mental Health Unit, Hospital Universitari Mútua de Terrassa, Barcelona, Spain
| | - Stephen W Scherer
- The Centre for Applied Genomics, Genetics, and Genome Biology, The Hospital for Sick Children, Toronto, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Canada.,McLaughlin Centre, University of Toronto, Toronto, Canada
| | - Christina Corsello
- Department of Psychiatry, University of California San Diego, La Jolla, CA 92093, USA.,Rady Children's Hospital, San Diego, CA 92123, USA
| | - Jonathan Sebat
- Beyster Center for Genomics of Psychiatric Diseases, University of California San Diego, La Jolla, CA 92093, USA.,Department of Psychiatry, University of California San Diego, La Jolla, CA 92093, USA.,Department of Cellular and Molecular Medicine and Pediatrics, University of California San Diego, La Jolla, CA 92093, USA
| |
|
95
|
Li W, Wang M, Sun J, Wang Y, Jiang R. Gene co-opening network deciphers gene functional relationships. Mol Biosyst 2018; 13:2428-2439. [PMID: 28976510 DOI: 10.1039/c7mb00430c] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/22/2022]
Abstract
Genome sequencing technology has generated a vast amount of genomic and epigenomic data, and has provided us a great opportunity to study gene functions on a global scale from an epigenomic view. In the last decade, network-based studies, such as those based on PPI networks and co-expression networks, have shown good performance in capturing functional relationships between genes. However, the functions of a gene and the mechanism of interaction of genes with each other to elucidate their functions are still not entirely clear. Here, we construct a gene co-opening network based on chromatin accessibility of genes. We show that genes related to a specific biological process or the same disease tend to be clustered in the co-opening network. This understanding allows us to detect functional clusters from the network and to predict new functions for genes. We further apply the network to prioritize disease genes for Psoriasis, and demonstrate the power of the joint analysis of the co-opening network and GWAS data in identifying disease genes. Taken together, the co-opening network provides a new viewpoint for the elucidation of gene associations and the interpretation of disease mechanisms.
Affiliation(s)
- Wenran Li
- MOE Key Laboratory of Bioinformatics, Bioinformatics Division and Center for Synthetic & Systems Biology, TNLIST, Department of Automation, Tsinghua University, Beijing 100084, China.
| | | | | | | | | |
|
96
|
Zhang P, Xia JH, Zhu J, Gao P, Tian YJ, Du M, Guo YC, Suleman S, Zhang Q, Kohli M, Tillmans LS, Thibodeau SN, French AJ, Cerhan JR, Wang LD, Wei GH, Wang L. High-throughput screening of prostate cancer risk loci by single nucleotide polymorphisms sequencing. Nat Commun 2018; 9:2022. [PMID: 29789573 PMCID: PMC5964124 DOI: 10.1038/s41467-018-04451-x] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Received: 08/24/2017] [Accepted: 05/02/2018] [Indexed: 12/18/2022] Open
Abstract
Functional characterization of disease-causing variants at risk loci has been a significant challenge. Here we report a high-throughput single-nucleotide polymorphisms sequencing (SNPs-seq) technology to simultaneously screen hundreds to thousands of SNPs for their allele-dependent protein-binding differences. This technology takes advantage of higher retention rate of protein-bound DNA oligos in protein purification column to quantitatively sequence these SNP-containing oligos. We apply this technology to test prostate cancer-risk loci and observe differential allelic protein binding in a significant number of selected SNPs. We also test a unique application of self-transcribing active regulatory region sequencing (STARR-seq) in characterizing allele-dependent transcriptional regulation and provide detailed functional analysis at two risk loci (RGS17 and ASCL2). Together, we introduce a powerful high-throughput pipeline for large-scale screening of functional SNPs at disease risk loci. Functional characterization of disease-causing variants at risk loci in cancer is challenging. Here, in prostate cancer the authors report a pipeline for high-throughput single-nucleotide polymorphisms sequencing (SNPs-seq) for large scale screening of functional SNPs at disease risk loci.
Affiliation(s)
- Peng Zhang
- Henan Key Laboratory for Esophageal Cancer Research, The First Affiliated Hospital of Zhengzhou University, 40 Daxue Road, 450052, Zhengzhou, Henan, China.,Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
| | - Ji-Han Xia
- Biocenter Oulu, Faculty of Biochemistry and Molecular Medicine, University of Oulu, Aapistie 5 A, 90220, Oulu, Finland
| | - Jing Zhu
- Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
| | - Ping Gao
- Biocenter Oulu, Faculty of Biochemistry and Molecular Medicine, University of Oulu, Aapistie 5 A, 90220, Oulu, Finland
| | - Yi-Jun Tian
- Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
| | - Meijun Du
- Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
| | - Yong-Chen Guo
- Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
| | - Sufyan Suleman
- Biocenter Oulu, Faculty of Biochemistry and Molecular Medicine, University of Oulu, Aapistie 5 A, 90220, Oulu, Finland
| | - Qin Zhang
- Biocenter Oulu, Faculty of Biochemistry and Molecular Medicine, University of Oulu, Aapistie 5 A, 90220, Oulu, Finland
| | - Manish Kohli
- Department of Oncology, Mayo Clinic, 200 First Street SW, Rochester, MN, 55905, USA
| | - Lori S Tillmans
- Department of Laboratory Medicine and Pathology, Mayo Clinic, 200 First Street SW, Rochester, MN, 55905, USA
| | - Stephen N Thibodeau
- Department of Laboratory Medicine and Pathology, Mayo Clinic, 200 First Street SW, Rochester, MN, 55905, USA
| | - Amy J French
- Department of Laboratory Medicine and Pathology, Mayo Clinic, 200 First Street SW, Rochester, MN, 55905, USA
| | - James R Cerhan
- Department of Health Sciences Research, Mayo Clinic, 200 First Street SW, Rochester, MN, 55905, USA
| | - Li-Dong Wang
- Henan Key Laboratory for Esophageal Cancer Research, The First Affiliated Hospital of Zhengzhou University, 40 Daxue Road, 450052, Zhengzhou, Henan, China.
| | - Gong-Hong Wei
- Biocenter Oulu, Faculty of Biochemistry and Molecular Medicine, University of Oulu, Aapistie 5 A, 90220, Oulu, Finland.
| | - Liang Wang
- Department of Pathology, MCW Cancer Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA.
| |
|
97
|
Chen B, Wang J, Chen Y, Gu X, Feng X. The MDM2 rs937283 A > G variant significantly increases the risk of lung and gastric cancer in Chinese population. Int J Clin Oncol 2018; 23:867-876. [PMID: 29777315 DOI: 10.1007/s10147-018-1295-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/17/2018] [Accepted: 05/14/2018] [Indexed: 12/16/2022]
Abstract
BACKGROUND Currently, the MDM2 promoter rs937283 A > G variant that is able to alter MDM2 gene expression has been widely studied to explore the association of MDM2 with cancer risk. In this report, we investigate the association of MDM2 rs937283 A > G variant with risk of lung cancer (LC) and gastric cancer (GC) in a Chinese population of Hubei province, which was followed by a meta-analysis. METHODS The genotyping of rs937283 was performed by polymerase chain reaction-restriction fragment length polymorphism and confirmed by sequencing. RESULTS The results of the present study showed that rs937283 was significantly associated with the risk of LC, and the factors of age, gender, smoking status and drinking status would affect such association. However, rs937283 was only associated with the risk of GC in male, smoking and drinking subgroups. The following meta-analysis demonstrated that rs937283 was associated with the overall cancer risk particularly in Chinese population, which reinforced our present finding. Moreover, the meta-analysis according to cancer types revealed that rs937283 was associated with retinoblastoma risk, but not squamous cell carcinoma risk. CONCLUSION Collectively, the MDM2 rs937283 A > G variant may be a valuable risk factor or diagnostic biomarker for Chinese cancer patients.
Affiliation(s)
- Bifeng Chen
- Department of Biological Science and Technology, School of Chemistry, Chemical Engineering and Life Sciences, Wuhan University of Technology, Wuhan, China.
| | - Jieling Wang
- Department of Biological Science and Technology, School of Chemistry, Chemical Engineering and Life Sciences, Wuhan University of Technology, Wuhan, China
| | - Yucan Chen
- Department of Biological Science and Technology, School of Chemistry, Chemical Engineering and Life Sciences, Wuhan University of Technology, Wuhan, China
| | - Xiuli Gu
- Center of Reproductive Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.,Wuhan Tongji Reproductive Medicine Hospital, Wuhan, China
| | - Xianhong Feng
- Clinical Laboratory, Wuhan Xinzhou District People's Hospital, Wuhan, China.
| |
|
98
|
de Valles-Ibáñez G, Esteve-Solé A, Piquer M, González-Navarro EA, Hernandez-Rodriguez J, Laayouni H, González-Roca E, Plaza-Martin AM, Deyà-Martínez Á, Martín-Nalda A, Martínez-Gallo M, García-Prat M, Del Pino-Molina L, Cuscó I, Codina-Solà M, Batlle-Masó L, Solís-Moruno M, Marquès-Bonet T, Bosch E, López-Granados E, Aróstegui JI, Soler-Palacín P, Colobran R, Yagüe J, Alsina L, Juan M, Casals F. Evaluating the Genetics of Common Variable Immunodeficiency: Monogenetic Model and Beyond. Front Immunol 2018; 9:636. [PMID: 29867916 PMCID: PMC5960686 DOI: 10.3389/fimmu.2018.00636] [Citation(s) in RCA: 107] [Impact Index Per Article: 15.3] [Received: 12/12/2017] [Accepted: 03/14/2018] [Indexed: 12/16/2022] Open
Abstract
Common variable immunodeficiency (CVID) is the most frequent symptomatic primary immunodeficiency, characterized by recurrent infections, hypogammaglobulinemia and poor response to vaccines. Its diagnosis is made based on clinical and immunological criteria, after exclusion of other diseases that can cause similar phenotypes. Currently, fewer than 20% of cases of CVID have a known underlying genetic cause. We have analyzed whole-exome sequencing and copy number variant data of 36 children and adolescents diagnosed with CVID and healthy relatives to estimate the proportion of monogenic cases. We have replicated an association of CVID with p.C104R in TNFRSF13B and reported the second homozygous patient to date. Our results also identify five causative genetic variants in LRBA, CTLA4, NFKB1, and PIK3R1, as well as other very likely causative variants in PRKCD, MAPK8, or DOCK8 among others. We experimentally validate the effect of the LRBA stop-gain mutation, which abolishes protein production and downregulates the expression of CTLA4, and of the frameshift indel in CTLA4, which downregulates expression of the protein. Our results indicate a monogenic origin of at least 15–24% of the CVID cases included in the study. The proportion of monogenic patients seems to be lower in CVID than in other PIDs that have also been analyzed by whole-exome or targeted gene panel sequencing. Regardless of the exact proportion of CVID monogenic cases, other genetic models have to be considered for CVID. We propose that because of its prevalence and other features such as intermediate penetrance and phenotypic variation within families, CVID could fit with other more complex genetic scenarios. In particular, in this work, we explore the possibility of CVID arising under an oligogenic model, with heterozygous mutations in interacting proteins, or through the accumulation of detrimental variants in particular immunological pathways, and we perform association tests to detect association with rare functional genetic variation in the CVID cohort compared to healthy controls.
Affiliation(s)
- Guillem de Valles-Ibáñez
- Institut de Biologia Evolutiva (UPF-CSIC), Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain
| | - Ana Esteve-Solé
- Allergy and Clinical Immunology Department, Hospital Sant Joan de Déu, Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Barcelona, Spain.,Functional Unit of Clinical Immunology Hospital Sant Joan de Déu-Hospital Clinic, Barcelona, Spain
| | - Mònica Piquer
- Allergy and Clinical Immunology Department, Hospital Sant Joan de Déu, Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Barcelona, Spain.,Functional Unit of Clinical Immunology Hospital Sant Joan de Déu-Hospital Clinic, Barcelona, Spain
| | - E Azucena González-Navarro
- Functional Unit of Clinical Immunology Hospital Sant Joan de Déu-Hospital Clinic, Barcelona, Spain.,Servei d'Immunologia, Centre de Diagnòstic Biomèdic, Hospital Clinic-IDIBAPS, Barcelona, Spain
| | - Jessica Hernandez-Rodriguez
- Institut de Biologia Evolutiva (UPF-CSIC), Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain
| | - Hafid Laayouni
- Institut de Biologia Evolutiva (UPF-CSIC), Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain.,Bioinformatics Studies, ESCI-UPF, Barcelona, Spain
| | - Eva González-Roca
- Functional Unit of Clinical Immunology Hospital Sant Joan de Déu-Hospital Clinic, Barcelona, Spain.,Servei d'Immunologia, Centre de Diagnòstic Biomèdic, Hospital Clinic-IDIBAPS, Barcelona, Spain
| | - Ana María Plaza-Martin
- Allergy and Clinical Immunology Department, Hospital Sant Joan de Déu, Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Barcelona, Spain.,Functional Unit of Clinical Immunology Hospital Sant Joan de Déu-Hospital Clinic, Barcelona, Spain
| | - Ángela Deyà-Martínez
- Allergy and Clinical Immunology Department, Hospital Sant Joan de Déu, Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Barcelona, Spain.,Functional Unit of Clinical Immunology Hospital Sant Joan de Déu-Hospital Clinic, Barcelona, Spain
| | - Andrea Martín-Nalda
- Pediatric Infectious Diseases and Immunodeficiencies Unit, Hospital Universitari Vall d'Hebron (HUVH), Vall d'Hebron Institut de Recerca (VHIR), Universitat Autònoma de Barcelona, Barcelona, Spain.,Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies, Barcelona, Spain
| | - Mónica Martínez-Gallo
- Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies, Barcelona, Spain.,Immunology Division, Department of Clinical and Molecular Genetics, Hospital Universitari Vall d'Hebron (HUVH), Vall d'Hebron Research Institute (VHIR), Barcelona, Spain.,Department of Cell Biology, Physiology and Immunology, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Marina García-Prat
- Pediatric Infectious Diseases and Immunodeficiencies Unit, Hospital Universitari Vall d'Hebron (HUVH), Vall d'Hebron Institut de Recerca (VHIR), Universitat Autònoma de Barcelona, Barcelona, Spain.,Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies, Barcelona, Spain
| | - Lucía Del Pino-Molina
- Clinical Immunology Department, University Hospital La Paz and Physiopathology of Lymphocytes in Immunodeficiencies Group, IdiPAZ Institute for Health Research, Madrid, Spain
| | - Ivón Cuscó
- Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona, Spain.,Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBER-ER), Madrid, Spain
| | - Marta Codina-Solà
- Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona, Spain.,Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBER-ER), Madrid, Spain
| | - Laura Batlle-Masó
- Institut de Biologia Evolutiva (UPF-CSIC), Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain.,Servei de Genòmica, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain
| | - Manuel Solís-Moruno
- Institut de Biologia Evolutiva (UPF-CSIC), Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain.,Servei de Genòmica, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain
| | - Tomàs Marquès-Bonet
- Institut de Biologia Evolutiva (UPF-CSIC), Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain.,Catalan Institution of Research and Advanced Studies (ICREA), Barcelona, Spain.,CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Elena Bosch
- Institut de Biologia Evolutiva (UPF-CSIC), Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain
| | - Eduardo López-Granados
- Clinical Immunology Department, University Hospital La Paz and Physiopathology of Lymphocytes in Immunodeficiencies Group, IdiPAZ Institute for Health Research, Madrid, Spain
| | - Juan Ignacio Aróstegui
- Functional Unit of Clinical Immunology Hospital Sant Joan de Déu-Hospital Clinic, Barcelona, Spain.,Servei d'Immunologia, Centre de Diagnòstic Biomèdic, Hospital Clinic-IDIBAPS, Barcelona, Spain
| | - Pere Soler-Palacín
- Pediatric Infectious Diseases and Immunodeficiencies Unit, Hospital Universitari Vall d'Hebron (HUVH), Vall d'Hebron Institut de Recerca (VHIR), Universitat Autònoma de Barcelona, Barcelona, Spain.,Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies, Barcelona, Spain
| | - Roger Colobran
- Jeffrey Modell Diagnostic and Research Center for Primary Immunodeficiencies, Barcelona, Spain.,Immunology Division, Department of Clinical and Molecular Genetics, Hospital Universitari Vall d'Hebron (HUVH), Vall d'Hebron Research Institute (VHIR), Barcelona, Spain.,Department of Cell Biology, Physiology and Immunology, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Jordi Yagüe
- Functional Unit of Clinical Immunology Hospital Sant Joan de Déu-Hospital Clinic, Barcelona, Spain.,Servei d'Immunologia, Centre de Diagnòstic Biomèdic, Hospital Clinic-IDIBAPS, Barcelona, Spain
| | - Laia Alsina
- Allergy and Clinical Immunology Department, Hospital Sant Joan de Déu, Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Barcelona, Spain.,Functional Unit of Clinical Immunology Hospital Sant Joan de Déu-Hospital Clinic, Barcelona, Spain
| | - Manel Juan
- Functional Unit of Clinical Immunology Hospital Sant Joan de Déu-Hospital Clinic, Barcelona, Spain.,Servei d'Immunologia, Centre de Diagnòstic Biomèdic, Hospital Clinic-IDIBAPS, Barcelona, Spain
| | - Ferran Casals
- Servei de Genòmica, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain
| |
|
99
|
Valenzuela D, Norri T, Välimäki N, Pitkänen E, Mäkinen V. Towards pan-genome read alignment to improve variation calling. BMC Genomics 2018; 19:87. [PMID: 29764365 PMCID: PMC5954285 DOI: 10.1186/s12864-018-4465-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 01/12/2023] Open
Abstract
Background: A typical human genome differs from the reference genome at 4-5 million sites. This diversity is increasingly catalogued in repositories such as ExAC/gnomAD, consisting of >15,000 whole genomes and >126,000 exome sequences from different individuals. Despite this enormous diversity, resequencing data workflows are still based on a single human reference genome. Identification and genotyping of genetic variants are typically carried out on short-read data aligned to a single reference, disregarding the underlying variation. Results: We propose a new unified framework for variant calling with short-read data utilizing a representation of human genetic variation, a pan-genomic reference. We provide a modular pipeline that can be seamlessly incorporated into existing sequencing data analysis workflows. Our tool is open source and available online: https://gitlab.com/dvalenzu/PanVC. Conclusions: Our experiments show that by replacing a standard human reference with a pan-genomic one, we achieve an improvement in single-nucleotide variant calling accuracy and in short indel calling accuracy over the widely adopted Genome Analysis Toolkit (GATK) in difficult genomic regions.
Affiliation(s)
- Daniel Valenzuela
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, P.O. Box 68 (Gustaf Hällströmin katu 2b), Helsinki, 00014, Finland
| | - Tuukka Norri
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, P.O. Box 68 (Gustaf Hällströmin katu 2b), Helsinki, 00014, Finland
| | - Niko Välimäki
- Department of Medical and Clinical Genetics, Genome-Scale Biology Program, University of Helsinki, Helsinki, Finland
| | - Esa Pitkänen
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
| | - Veli Mäkinen
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, P.O. Box 68 (Gustaf Hällströmin katu 2b), Helsinki, 00014, Finland
| |
|
100
|
Wang Z, Ng KS, Chen T, Kim TB, Wang F, Shaw K, Scott KL, Meric-Bernstam F, Mills GB, Chen K. Cancer driver mutation prediction through Bayesian integration of multi-omic data. PLoS One 2018; 13:e0196939. [PMID: 29738578 PMCID: PMC5940219 DOI: 10.1371/journal.pone.0196939] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 09/14/2017] [Accepted: 04/23/2018] [Indexed: 01/23/2023] Open
Abstract
Identification of cancer driver mutations is critical for advancing cancer research and personalized medicine. Due to inter-tumor genetic heterogeneity, many driver mutations occur at low frequencies, which makes it challenging to distinguish them from passenger mutations. Here, we show that a novel Bayesian hierarchical modeling approach, named rDriver, can achieve enhanced prediction accuracy by identifying mutations that not only have high functional impact scores but also are associated with systemic variation in gene expression levels. In examining 3,080 tumor samples from 8 cancer types in The Cancer Genome Atlas, rDriver predicted 1,389 driver mutations. Compared with existing tools, rDriver identified more low-frequency mutations associated with lineage-specific functional properties, timing of occurrence, and patient survival. Evaluation of rDriver predictions using engineered cell-line models resulted in a positive predictive value of 0.94 in PIK3CA genes. Our study highlights the importance of integrating multi-omic data in predicting cancer driver mutations and provides a statistically rigorous solution for cancer target discovery and development.
Affiliation(s)
- Zixing Wang
- Department of Bioinformatics and Computational Biology, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, United States of America
- Institute for Personalized Cancer Therapy, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, United States of America
| | - Kwok-Shing Ng
- Institute for Personalized Cancer Therapy, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, United States of America
| | - Tenghui Chen
- Department of Bioinformatics and Computational Biology, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, United States of America
| | - Tae-Beom Kim
- Department of Bioinformatics and Computational Biology, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, United States of America
| | - Fang Wang
- Department of Bioinformatics and Computational Biology, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, United States of America
| | - Kenna Shaw
- Institute for Personalized Cancer Therapy, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, United States of America
| | - Kenneth L. Scott
- Department of Human and Molecular Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - Funda Meric-Bernstam
- Institute for Personalized Cancer Therapy, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, United States of America
- Department of Investigational Cancer Therapy, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, United States of America
| | - Gordon B. Mills
- Institute for Personalized Cancer Therapy, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, United States of America
- Department of Systems Biology, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, United States of America
| | - Ken Chen
- Department of Bioinformatics and Computational Biology, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, United States of America
- Institute for Personalized Cancer Therapy, The University of Texas M.D. Anderson Cancer Center, Houston, Texas, United States of America
| |
|