1. Dietlein F, Weghorn D, Taylor-Weiner A, Richters A, Reardon B, Liu D, Lander ES, Van Allen EM, Sunyaev SR. Identification of cancer driver genes based on nucleotide context. Nat Genet 2020; 52:208-218. [PMID: 32015527 PMCID: PMC7031046 DOI: 10.1038/s41588-019-0572-y] [Citation(s) in RCA: 147] [Impact Index Per Article: 29.4] [Received: 05/17/2019] [Accepted: 12/16/2019] [Indexed: 12/26/2022]
Abstract
Cancer genomes contain large numbers of somatic mutations but few of these mutations drive tumor development. Current approaches either identify driver genes on the basis of mutational recurrence or approximate the functional consequences of nonsynonymous mutations by using bioinformatic scores. Passenger mutations are enriched in characteristic nucleotide contexts, whereas driver mutations occur in functional positions, which are not necessarily surrounded by a particular nucleotide context. We observed that mutations in contexts that deviate from the characteristic contexts around passenger mutations provide a signal in favor of driver genes. We therefore developed a method that combines this feature with the signals traditionally used for driver-gene identification. We applied our method to whole-exome sequencing data from 11,873 tumor-normal pairs and identified 460 driver genes that clustered into 21 cancer-related pathways. Our study provides a resource of driver genes across 28 tumor types with additional driver genes identified according to mutations in unusual nucleotide contexts.
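The idea of a mutation falling in an "unusual" nucleotide context can be illustrated with a toy sketch. This is not the authors' statistical model: the function names, the trinucleotide window, the toy sequence, and the 0.05 threshold are all assumptions chosen for illustration only.

```python
from collections import Counter

def trinucleotide_context(sequence: str, position: int) -> str:
    """Return the base at `position` (0-based) plus one flanking base on each side."""
    if position < 1 or position > len(sequence) - 2:
        raise ValueError("position needs a flanking base on both sides")
    return sequence[position - 1 : position + 2].upper()

def context_frequencies(sequence: str, positions: list[int]) -> dict[str, float]:
    """Relative frequency of each trinucleotide context among the given positions."""
    counts = Counter(trinucleotide_context(sequence, p) for p in positions)
    total = sum(counts.values())
    return {context: n / total for context, n in counts.items()}

def unusual_positions(sequence, positions, background, threshold=0.05):
    """Positions whose context is rare (below `threshold`) in the background profile."""
    return [
        p for p in positions
        if background.get(trinucleotide_context(sequence, p), 0.0) < threshold
    ]

# Hypothetical toy data: presumed passenger mutations all sit in TCN contexts.
sequence = "TTCGATCAGGTCATCG"
passenger_positions = [2, 6, 11, 14]
background = context_frequencies(sequence, passenger_positions)

# Position 6 (TCA) matches the background; position 8 (AGG) does not.
flagged = unusual_positions(sequence, [6, 8], background)
print(flagged)
```

In this toy setting a flagged position is only a hint that the mutation does not look like a typical passenger event; a real analysis would need a proper background model per tumor type and a significance test.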
Affiliation(s)
- Felix Dietlein
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA
- Donate Weghorn
  - Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Centre for Genomic Regulation, Barcelona, Spain
- Amaro Taylor-Weiner
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA
- André Richters
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA
  - Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Brendan Reardon
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA
- David Liu
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA
- Eric S Lander
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA
- Eliezer M Van Allen
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
  - Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA
- Shamil R Sunyaev
  - Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
2. Liang Y, Jiang L, Zhong X, Hochwald SN, Wang Y, Huang L, Nie Q, Huang H, Xu JF. Discovery of Aberrant Alteration of Genome in Colorectal Cancer by Exome Sequencing. Am J Med Sci 2019; 358:340-349. [PMID: 31445671 DOI: 10.1016/j.amjms.2019.07.012] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 03/28/2019] [Revised: 07/27/2019] [Accepted: 07/31/2019] [Indexed: 12/12/2022]
Abstract
BACKGROUND: This study analyzed multiple parameters, including somatic single nucleotide variations (SNVs), insertions/deletions, significantly mutated genes (SMGs), copy number variations and frequently altered pathways, to discover novel aberrations in the tumorigenesis of colorectal cancer (CRC). MATERIALS AND METHODS: Exome sequencing was performed on an Illumina platform to identify novel potential somatic variants in 34 paired tumor and adjacent normal tissues from 17 CRC patients. Results were compared with databases (dbSNP138, 1000 Genomes SNP, HapMap, Catalogue of Somatic Mutations in Cancer and ESP6500) and analyzed. MuSiC software was used to identify SMGs. RESULTS: In total, 1,637 somatic SNVs were identified in the 17 analyzed tumors. Only 7 SNVs were shared by more than 1 tumor, suggesting that over 99% of the analyzed SNVs were independent events. KRAS p.G12D and ZNF717 p.L39V were the most common SNVs. Moreover, 10 SMGs were found: KRAS, TP53, SMAD4, ZNF717, FBXW7, APC, ZNF493, CDR1, armadillo repeat containing 4 (ARMC4) and sulfatase-modifying factor 2 (SUMF2). Among those, ZNF717, ZNF493, CDR1, ARMC4 and SUMF2 were novel frequently mutated genes in CRC. In the copy number variation analysis, gains in 10q25.3, 1p31.1, 1q44, 10q23.33, 11p15.4 and 20q13.33, and losses of 3q21.3 and 3q29 were the frequent aberrations identified. CONCLUSIONS: We frequently found mutations in the novel genes ZNF717, ZNF493, CDR1, ARMC4 and SUMF2, and gains in 10q25.3, which may be functional alterations in CRC. The high frequency of private events such as SNVs confirms the highly heterogeneous mutations found in CRCs. The mutated gene sites may vary significantly between patients, which may make clinical treatment more challenging.
Affiliation(s)
- Yuanzi Liang
  - Department of Clinical Immunology, Institute of Clinical Laboratory Medicine, Guangdong Provincial Key Laboratory of Medical Molecular Diagnostics, Guangdong Medical University, Dongguan, Guangdong, China
- Xiaogang Zhong
  - Department of Gastrointestinal Surgery, The People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, Guangxi, China
- Steven N Hochwald
  - Department of Surgical Oncology, Roswell Park Cancer Institute, Buffalo, New York
- Yongsi Wang
  - Division of Genome Sequencing, Huayin Medical Laboratory, Guangzhou, Guangdong, China
- Lihe Huang
  - Department of Laboratory Medicine, Debao County Hospital, Baise, Guangxi, China
- Qiumiao Nie
  - Wilking Biotechnology Co., Ltd, Nanning, Guangxi, China
- Huayi Huang
  - Department of Laboratory Medicine and Department of Surgical Oncology, Roswell Park Cancer Institute, Buffalo, New York
- Jun-Fa Xu
  - Department of Clinical Immunology, Institute of Clinical Laboratory Medicine, Guangdong Provincial Key Laboratory of Medical Molecular Diagnostics, Guangdong Medical University, Dongguan, Guangdong, China
3. Jensen KH, Izarzugaza JM, Juncker AS, Hansen RB, Hansen TF, Timshel P, Blondal T, Jensen TS, Rygaard-Hjalsted E, Mouritzen P, Thorsen M, Wernersson R, Nielsen HB, Jakobsen A, Brunak S, Sørensen FB. Analysis of a gene panel for targeted sequencing of colorectal cancer samples. Oncotarget 2018; 9:9043-9060. [PMID: 29507673 PMCID: PMC5823670 DOI: 10.18632/oncotarget.24138] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 07/14/2017] [Accepted: 12/30/2017] [Indexed: 12/19/2022]
Abstract
Colorectal cancer (CRC) is a leading cause of death worldwide. Surgical intervention is a successful treatment for stage I patients, whereas more advanced cases may require adjuvant chemotherapy. The selection of effective adjuvant treatments remains challenging, however. Accurate patient stratification is necessary to identify the subset of patients likely to respond to treatment, while sparing others from pernicious treatment. Targeted sequencing approaches may help in this regard, enabling rapid genetic investigation while remaining easily applicable in routine diagnosis. We propose a set of guidelines for the identification, including variant calling and filtering, of somatic mutations driving tumorigenesis in the absence of matched healthy tissue. We also discuss the inclusion criteria for the generation of our gene panel. Furthermore, we evaluate the prognostic impact of individual genes, using Cox regression models in the context of overall survival and disease-free survival. These analyses confirmed the role of commonly used biomarkers, and shed light on controversial genes such as CYP2C8. Applying those guidelines, we created a novel gene panel to investigate the onset and progression of CRC in 273 patients. Our comprehensive biomarker set includes 266 genes that may play a role in the progression through the different stages of the disease. Tracing the developmental state of the tumour, and its resistances, is instrumental in patient stratification and reliable decision making in precision clinical practice.
Affiliation(s)
- Klaus Højgaard Jensen
  - Department of Bio and Health Informatics, Technical University of Denmark, Kgs. Lyngby 2800, Denmark
  - Intomics A/S, Kgs. Lyngby 2800, Denmark
- Jose M.G. Izarzugaza
  - Department of Bio and Health Informatics, Technical University of Denmark, Kgs. Lyngby 2800, Denmark
- Pascal Timshel
  - Department of Bio and Health Informatics, Technical University of Denmark, Kgs. Lyngby 2800, Denmark
- Henrik Bjørn Nielsen
  - Department of Bio and Health Informatics, Technical University of Denmark, Kgs. Lyngby 2800, Denmark
- Søren Brunak
  - Department of Bio and Health Informatics, Technical University of Denmark, Kgs. Lyngby 2800, Denmark
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen DK-2200, Denmark
- Flemming Brandt Sørensen
  - Oncology Department, Vejle Hospital, Vejle 7100, Denmark
  - Patologisk Institut, Aarhus Universitetshospital, Aarhus 8200, Denmark
4. Analysis of somatic mutations across the kinome reveals loss-of-function mutations in multiple cancer types. Sci Rep 2017; 7:6418. [PMID: 28743916 PMCID: PMC5527104 DOI: 10.1038/s41598-017-06366-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 07/14/2016] [Accepted: 06/13/2017] [Indexed: 12/17/2022]
Abstract
In this study we use somatic cancer mutations to identify important functional residues within sets of related genes. We focus on protein kinases, a superfamily of phosphotransferases that share homologous sequences and structural motifs and have many connections to cancer. We develop several statistical tests for identifying Significantly Mutated Positions (SMPs), which are positions in an alignment with mutations that show signs of selection. We apply our methods to 21,917 mutations that map to the alignment of human kinases and identify 23 SMPs. SMPs occur throughout the alignment, with many in the important A-loop region, and others spread between the N and C lobes of the kinase domain. Since mutations are pooled across the superfamily, these positions may be important to many protein kinases. We select eleven mutations from these positions for functional validation. All eleven mutations cause a reduction or loss of function in the affected kinase. The tested mutations are from four genes, including two tumor suppressors (TGFBR1 and CHEK2) and two oncogenes (KDR and ERBB2). They also represent multiple cancer types, and include both recurrent and non-recurrent events. Many of these mutations warrant further investigation as potential cancer drivers.
5. Pan X, Cang X, Dan S, Li J, Cheng J, Kang B, Duan X, Shen B, Wang YJ. Site-specific Disruption of the Oct4/Sox2 Protein Interaction Reveals Coordinated Mesendodermal Differentiation and the Epithelial-Mesenchymal Transition. J Biol Chem 2016; 291:18353-69. [PMID: 27369080 DOI: 10.1074/jbc.m116.745414] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 06/28/2016] [Indexed: 12/15/2022]
Abstract
Although the Oct4/Sox2 complex is crucial for maintaining the pluripotency of stem cells, the molecular basis underlying its regulation during lineage-specific differentiation remains unknown. Here, we revealed that the highly conserved Oct4/Lys-156 is important for maintaining the stability of the Oct4 protein and the intermolecular salt bridge between Oct4/Lys-151 and Sox2/Asp-107 that contributes to the Oct4/Sox2 interaction. Post-translational modifications at Lys-156 and K156N, a somatic mutation detected in bladder cancer patients, both impaired the Lys-151-Asp-107 salt bridge and the Oct4/Sox2 interaction. When produced as a recombinant protein or overexpressed in pluripotent stem cells, Oct4/K156N, with reduced binding to Sox2, significantly down-regulated the stemness genes that are cooperatively controlled by the Oct4/Sox2 complex and specifically up-regulated the mesendodermal genes and the SNAIL family genes that promote the epithelial-mesenchymal transition. Thus, we conclude that Oct4/Lys-156-modulated Oct4/Sox2 interaction coordinately controls the epithelial-mesenchymal transition and mesendoderm specification induced by specific differentiation signals.
Affiliation(s)
- Xiao Pan
  - College of Life Sciences, Zhejiang University, 866 Yuhangtang Road, Hangzhou, Zhejiang 310058, China
- Xiaohui Cang
  - College of Life Sciences, Zhejiang University, 866 Yuhangtang Road, Hangzhou, Zhejiang 310058, China
- Songsong Dan
  - State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, First Affiliated Hospital, School of Medicine, Zhejiang University, 79 QingChun Road, Hangzhou, Zhejiang 310003, China
- Jingchao Li
  - College of Life Sciences, Zhejiang University, 866 Yuhangtang Road, Hangzhou, Zhejiang 310058, China
- Jie Cheng
  - College of Life Sciences, Zhejiang University, 866 Yuhangtang Road, Hangzhou, Zhejiang 310058, China
  - State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, First Affiliated Hospital, School of Medicine, Zhejiang University, 79 QingChun Road, Hangzhou, Zhejiang 310003, China
- Bo Kang
  - State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, First Affiliated Hospital, School of Medicine, Zhejiang University, 79 QingChun Road, Hangzhou, Zhejiang 310003, China
- Xiaotao Duan
  - State Key Laboratory of Toxicology and Medical Countermeasures, Beijing Institute of Pharmacology and Toxicology, Beijing 100850, China
- Binghui Shen
  - Department of Radiation Biology, City of Hope National Medical Center and Beckman Research Institute, Duarte, California 91010
- Ying-Jie Wang
  - State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, First Affiliated Hospital, School of Medicine, Zhejiang University, 79 QingChun Road, Hangzhou, Zhejiang 310003, China
6. Pons T, Vazquez M, Matey-Hernandez ML, Brunak S, Valencia A, Izarzugaza JM. KinMutRF: a random forest classifier of sequence variants in the human protein kinase superfamily. BMC Genomics 2016; 17 Suppl 2:396. [PMID: 27357839 PMCID: PMC4928150 DOI: 10.1186/s12864-016-2723-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 01/28/2023]
Abstract
Background: The association between aberrant signal processing by protein kinases and human diseases such as cancer was established a long time ago. However, understanding the link between sequence variants in the protein kinase superfamily and the mechanistic complex traits at the molecular level remains challenging: cells tolerate most genomic alterations, and only a minor fraction disrupt molecular function sufficiently to drive disease. Results: KinMutRF is a novel random-forest method to automatically identify pathogenic variants in human kinases. Twenty-six decision trees implemented as a random forest weigh a battery of features that characterize the variants: a) at the gene level, including membership of a Kinbase group and Gene Ontology terms; b) at the PFAM domain level; and c) at the residue level, the types of amino acids involved, changes in biochemical properties, and functional annotations from UniProt, Phospho.ELM and FireDB. KinMutRF identifies disease-associated variants satisfactorily (Acc: 0.88, Prec: 0.82, Rec: 0.75, F-score: 0.78, MCC: 0.68) when trained and cross-validated with the 3689 human kinase variants from UniProt that have been annotated as neutral or pathogenic. All unclassified variants were excluded from the training set. Furthermore, KinMutRF is discussed with respect to two independent kinase-specific sets of mutations not included in the training and testing, Kin-Driver (643 variants) and Pon-BTK (1495 variants). Moreover, we provide predictions for the 848 protein kinase variants in UniProt that remained unclassified. A public implementation of KinMutRF, including documentation and examples, is available online (http://kinmut2.bioinfo.cnio.es). The source code for local installation is released under a GPL version 3 license, and can be downloaded from https://github.com/Rbbt-Workflows/KinMut2. Conclusions: KinMutRF is capable of classifying kinase variation with good performance. Predictions by KinMutRF compare favorably in a benchmark with other state-of-the-art methods (i.e. SIFT, Polyphen-2, MutationAssessor, MutationTaster, LRT, CADD, FATHMM, and VEST). Kinase-specific features rank as the most informative in terms of information gain and are likely responsible for the improvement in prediction performance. This advocates for the development of family-specific classifiers able to exploit the discriminatory power of features unique to individual protein families.
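The reported precision, recall, and F-score can be checked against one another. The short sketch below assumes that the abstract's "F-score" is the standard F1 measure (the harmonic mean of precision and recall); the abstract does not state which F-measure was used, and the function name is illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """F1 measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)

# Values quoted in the abstract (Prec: 0.82, Rec: 0.75).
reported_precision = 0.82
reported_recall = 0.75

f1 = f_score(reported_precision, reported_recall)
print(round(f1, 2))  # 2 * 0.82 * 0.75 / 1.57 = 0.7834..., i.e. 0.78
```

Under that assumption the recomputed value agrees with the reported F-score of 0.78 to two decimal places.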
Affiliation(s)
- Tirso Pons
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Melchor Fernández Almagro, 3, 28029, Madrid, Spain
- Miguel Vazquez
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Melchor Fernández Almagro, 3, 28029, Madrid, Spain
- María Luisa Matey-Hernandez
  - Center for Biological Sequence Analysis (CBS), Systems Biology Department, Technical University of Denmark (DTU), Kemitorvet, Building 208, 2800 Kgs. Lyngby, Denmark
- Søren Brunak
  - Center for Biological Sequence Analysis (CBS), Systems Biology Department, Technical University of Denmark (DTU), Kemitorvet, Building 208, 2800 Kgs. Lyngby, Denmark
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Blegdamsvej 3A, 2200, Copenhagen, Denmark
- Alfonso Valencia
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Melchor Fernández Almagro, 3, 28029, Madrid, Spain
- Jose Mg Izarzugaza
  - Center for Biological Sequence Analysis (CBS), Systems Biology Department, Technical University of Denmark (DTU), Kemitorvet, Building 208, 2800 Kgs. Lyngby, Denmark
7. Vazquez M, Pons T, Brunak S, Valencia A, Izarzugaza JMG. wKinMut-2: Identification and Interpretation of Pathogenic Variants in Human Protein Kinases. Hum Mutat 2015; 37:36-42. [PMID: 26443060 DOI: 10.1002/humu.22914] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 04/02/2015] [Accepted: 09/22/2015] [Indexed: 12/31/2022]
Abstract
Most genomic alterations are tolerated, while only a minor fraction disrupts molecular function sufficiently to drive disease. Protein kinases play a central biological function and the functional consequences of their variants are abundantly characterized. However, this heterogeneous information is often scattered across different sources, which makes the integrative analysis complex and laborious. wKinMut-2 constitutes a solution to facilitate the interpretation of the consequences of human protein kinase variation. Nine methods predict their pathogenicity, including a kinase-specific random forest approach. To understand the biological mechanisms causative of human diseases and cancer, information from pertinent reference knowledge bases and the literature is automatically mined, digested, and homogenized. Variants are visualized in their structural contexts, and residues affecting catalysis and drug binding are identified. Known protein-protein interactions are reported. Altogether, this information is intended to assist the generation of new working hypotheses to be corroborated by subsequent experimental work. The wKinMut-2 system, along with a user manual and examples, is freely accessible at http://kinmut2.bioinfo.cnio.es; the code for local installations can be downloaded from https://github.com/Rbbt-Workflows/KinMut2.
Affiliation(s)
- Miguel Vazquez
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, 28029, Spain
- Tirso Pons
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, 28029, Spain
- Søren Brunak
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Copenhagen 2200, Denmark
  - Center for Biological Sequence Analysis (CBS), Systems Biology Department, Technical University of Denmark (DTU), Kongens Lyngby 2800, Denmark
- Alfonso Valencia
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, 28029, Spain
- Jose M G Izarzugaza
  - Center for Biological Sequence Analysis (CBS), Systems Biology Department, Technical University of Denmark (DTU), Kongens Lyngby 2800, Denmark
8. Sanz-Pamplona R, Lopez-Doriga A, Paré-Brunet L, Lázaro K, Bellido F, Alonso MH, Aussó S, Guinó E, Beltrán S, Castro-Giner F, Gut M, Sanjuan X, Closa A, Cordero D, Morón-Duran FD, Soriano A, Salazar R, Valle L, Moreno V. Exome Sequencing Reveals AMER1 as a Frequently Mutated Gene in Colorectal Cancer. Clin Cancer Res 2015; 21:4709-18. [PMID: 26071483 DOI: 10.1158/1078-0432.ccr-15-0159] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Received: 01/24/2015] [Accepted: 05/17/2015] [Indexed: 12/30/2022]
Abstract
PURPOSE: Somatic mutations occur at early stages of adenoma and accumulate throughout colorectal cancer progression. The aim of this study was to characterize the mutational landscape of stage II tumors and to search for novel recurrent mutations likely implicated in colorectal cancer tumorigenesis. EXPERIMENTAL DESIGN: The exomic DNA of 42 stage II, microsatellite-stable colon tumors and their paired mucosae was sequenced. Other molecular data available in the discovery dataset [gene expression, methylation, and copy number variations (CNV)] were used to further characterize these tumors. Additional datasets comprising 553 colorectal cancer samples were used to validate the discovered mutations. RESULTS: A total of 4,886 somatic single-nucleotide variants (SNV) were found. Almost all SNVs were private changes, with few mutations shared by more than one tumor, thus revealing tumor-specific mutational landscapes. Nevertheless, these diverse mutations converged into common cellular pathways, such as cell cycle or apoptosis. Among this mutational heterogeneity, variants resulting in early stop codons in the AMER1 (also known as FAM123B or WTX) gene emerged as recurrent mutations in colorectal cancer. Losses of AMER1 by mechanisms other than mutation, such as methylation and copy number aberrations, were also found. Tumors lacking this tumor suppressor gene exhibited a mesenchymal phenotype characterized by inhibition of the canonical Wnt pathway. CONCLUSIONS: In silico and experimental validation in independent datasets confirmed the existence of functional mutations in AMER1 in approximately 10% of analyzed colorectal cancer tumors. Moreover, these tumors exhibited a characteristic phenotype.
Affiliation(s)
- Rebeca Sanz-Pamplona
  - Unit of Biomarkers and Susceptibility, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL) and CIBERESP, L'Hospitalet de Llobregat, Barcelona, Spain
- Adriana Lopez-Doriga
  - Unit of Biomarkers and Susceptibility, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL) and CIBERESP, L'Hospitalet de Llobregat, Barcelona, Spain
- Laia Paré-Brunet
  - Unit of Biomarkers and Susceptibility, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL) and CIBERESP, L'Hospitalet de Llobregat, Barcelona, Spain
- Kira Lázaro
  - Unit of Biomarkers and Susceptibility, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL) and CIBERESP, L'Hospitalet de Llobregat, Barcelona, Spain
- Fernando Bellido
  - Hereditary Cancer Program, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona, Spain
- M Henar Alonso
  - Unit of Biomarkers and Susceptibility, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL) and CIBERESP, L'Hospitalet de Llobregat, Barcelona, Spain
- Susanna Aussó
  - Unit of Biomarkers and Susceptibility, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL) and CIBERESP, L'Hospitalet de Llobregat, Barcelona, Spain
- Elisabet Guinó
  - Unit of Biomarkers and Susceptibility, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL) and CIBERESP, L'Hospitalet de Llobregat, Barcelona, Spain
- Sergi Beltrán
  - Centre Nacional d'Anàlisi Genòmica (CNAG), Barcelona, Spain
- Marta Gut
  - Centre Nacional d'Anàlisi Genòmica (CNAG), Barcelona, Spain
- Xavier Sanjuan
  - Pathology Service, University Hospital Bellvitge (HUB-IDIBELL), L'Hospitalet de Llobregat, Barcelona, Spain
- Adria Closa
  - Unit of Biomarkers and Susceptibility, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL) and CIBERESP, L'Hospitalet de Llobregat, Barcelona, Spain
- David Cordero
  - Unit of Biomarkers and Susceptibility, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL) and CIBERESP, L'Hospitalet de Llobregat, Barcelona, Spain
- Francisco D Morón-Duran
  - Unit of Biomarkers and Susceptibility, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL) and CIBERESP, L'Hospitalet de Llobregat, Barcelona, Spain
- Antonio Soriano
  - Gastroenterology Service, University Hospital Bellvitge (HUB-IDIBELL), L'Hospitalet de Llobregat, Barcelona, Spain
- Ramón Salazar
  - Department of Medical Oncology, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona, Spain
  - Translational Research Laboratory, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona, Spain
- Laura Valle
  - Hereditary Cancer Program, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Barcelona, Spain
- Victor Moreno
  - Unit of Biomarkers and Susceptibility, Catalan Institute of Oncology (ICO), Bellvitge Biomedical Research Institute (IDIBELL) and CIBERESP, L'Hospitalet de Llobregat, Barcelona, Spain
  - Department of Clinical Sciences, Faculty of Medicine, University of Barcelona (UB), Barcelona, Spain
9. Prediction and prioritization of rare oncogenic mutations in the cancer Kinome using novel features and multiple classifiers. PLoS Comput Biol 2014; 10:e1003545. [PMID: 24743239 PMCID: PMC3990476 DOI: 10.1371/journal.pcbi.1003545] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 05/21/2013] [Accepted: 02/18/2014] [Indexed: 01/18/2023]
Abstract
Cancer is a genetic disease that develops through a series of somatic mutations, a subset of which drive cancer progression. Although cancer genome sequencing studies are beginning to reveal the mutational patterns of genes in various cancers, identifying the small subset of “causative” mutations from the large subset of “non-causative” mutations, which accumulate as a consequence of the disease, is a challenge. In this article, we present an effective machine learning approach for identifying cancer-associated mutations in human protein kinases, a class of signaling proteins known to be frequently mutated in human cancers. We evaluate the performance of 11 well known supervised learners and show that a multiple-classifier approach, which combines the performances of individual learners, significantly improves the classification of known cancer-associated mutations. We introduce several novel features related specifically to structural and functional characteristics of protein kinases and find that the level of conservation of the mutated residue at specific evolutionary depths is an important predictor of oncogenic effect. We consolidate the novel features and the multiple-classifier approach to prioritize and experimentally test a set of rare unconfirmed mutations in the epidermal growth factor receptor tyrosine kinase (EGFR). Our studies identify T725M and L861R as rare cancer-associated mutations inasmuch as these mutations increase EGFR activity in the absence of the activating EGF ligand in cell-based assays.
Author Summary
Cancer progresses by accumulation of mutations in a subset of genes that confer growth advantage. The 518 protein kinase genes encoded in the human genome, collectively called the kinome, represent one of the largest families of oncogenes. Targeted sequencing studies of many different cancers have shown that the mutational landscape comprises both cancer-causing “driver” mutations and harmless “passenger” mutations. While the frequent recurrence of some driver mutations in human cancers helps distinguish them from the large number of passenger mutations, a significant challenge is to identify the rare “driver” mutations that are less frequently observed in patient samples and yet are causative. Here we combine computational and experimental approaches to identify rare cancer-associated mutations in Epidermal Growth Factor receptor kinase (EGFR), a signaling protein frequently mutated in cancers. Specifically, we evaluate a novel multiple-classifier approach and features specific to the protein kinase super-family in distinguishing known cancer-associated mutations from benign mutations. We then apply the multiple classifier to identify and test the functional impact of rare cancer-associated mutations in EGFR. We report, for the first time, that the EGFR mutations T725M and L861R, which are infrequently observed in cancers, constitutively activate EGFR in a manner analogous to the frequently observed driver mutations.
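One common way to combine several learners is a majority vote over their predicted labels. The sketch below illustrates that general idea only; the abstract does not specify how the individual learners were combined, and the three toy rules, feature names, and thresholds are hypothetical stand-ins rather than the study's classifiers or features.

```python
from collections import Counter
from typing import Callable, Sequence

# A classifier here is any function mapping a feature dict to a label.
Classifier = Callable[[dict], str]

def majority_vote(classifiers: Sequence[Classifier], features: dict) -> str:
    """Return the label predicted by the most classifiers.

    Ties resolve to the label that was voted for first, because
    Counter.most_common keeps first-seen order among equal counts.
    """
    votes = Counter(classifier(features) for classifier in classifiers)
    label, _count = votes.most_common(1)[0]
    return label

# Hypothetical toy rules standing in for trained learners.
def by_conservation(features: dict) -> str:
    return "cancer-associated" if features["conservation"] > 0.8 else "benign"

def by_domain(features: dict) -> str:
    return "cancer-associated" if features["in_kinase_domain"] else "benign"

def by_recurrence(features: dict) -> str:
    return "cancer-associated" if features["recurrence"] >= 3 else "benign"

ensemble = [by_conservation, by_domain, by_recurrence]

# A rare mutation (low recurrence) at a conserved kinase-domain residue.
rare_mutation = {"conservation": 0.9, "in_kinase_domain": True, "recurrence": 1}
print(majority_vote(ensemble, rare_mutation))
```

The toy example shows why an ensemble can help with rare mutations: a recurrence-based rule alone would call this variant benign, but the other two votes outweigh it.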
|
10
|
Espinosa O, Mitsopoulos K, Hakas J, Pearl F, Zvelebil M. Deriving a mutation index of carcinogenicity using protein structure and protein interfaces. PLoS One 2014; 9:e84598. [PMID: 24454733 PMCID: PMC3893166 DOI: 10.1371/journal.pone.0084598] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 07/22/2013] [Accepted: 11/16/2013] [Indexed: 11/29/2022] Open
Abstract
With the advent of Next Generation Sequencing the identification of mutations in the genomes of healthy and diseased tissues has become commonplace. While much progress has been made to elucidate the aetiology of disease processes in cancer, the contributions to disease that many individual mutations make remain to be characterised and their downstream consequences on cancer phenotypes remain to be understood. Missense mutations commonly occur in cancers and their consequences remain challenging to predict. However, this knowledge is becoming more vital, for both assessing disease progression and for stratifying drug treatment regimes. Coupled with structural data, comprehensive genomic databases of mutations such as the 1000 Genomes project and COSMIC give an opportunity to investigate general principles of how cancer mutations disrupt proteins and their interactions at the molecular and network level. We describe a comprehensive comparison of cancer and neutral missense mutations; by combining features derived from structural and interface properties we have developed a carcinogenicity predictor, InCa (Index of Carcinogenicity). Upon comparison with other methods, we observe that InCa can predict mutations that might not be detected by other methods. We also discuss general limitations shared by all predictors that attempt to predict driver mutations and discuss how this could impact high-throughput predictions. A web interface to a server implementation is publicly available at http://inca.icr.ac.uk/.
Affiliation(s)
- Octavio Espinosa
- Breakthrough Breast Cancer Research Centre, Institute of Cancer Research, London, United Kingdom
- Konstantinos Mitsopoulos
- Breakthrough Breast Cancer Research Centre, Institute of Cancer Research, London, United Kingdom
- Jarle Hakas
- Breakthrough Breast Cancer Research Centre, Institute of Cancer Research, London, United Kingdom
- Frances Pearl
- UK Cancer Therapeutics Unit, The Institute of Cancer Research, London, United Kingdom
- Translational Drug Discovery Group, School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Marketa Zvelebil
- Breakthrough Breast Cancer Research Centre, Institute of Cancer Research, London, United Kingdom
|
11
|
Izarzugaza JMG, Vazquez M, del Pozo A, Valencia A. wKinMut: an integrated tool for the analysis and interpretation of mutations in human protein kinases. BMC Bioinformatics 2013; 14:345. [PMID: 24289158 PMCID: PMC3879071 DOI: 10.1186/1471-2105-14-345] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 11/07/2012] [Accepted: 05/30/2013] [Indexed: 11/13/2022] Open
Abstract
Background Protein kinases are involved in relevant physiological functions and a broad number of mutations in this superfamily have been reported in the literature to affect protein function and stability. Unfortunately, the exploration of the consequences on the phenotypes of each individual mutation remains a considerable challenge. Results The wKinMut web-server offers direct prediction of the potential pathogenicity of the mutations from a number of methods, including our recently developed prediction method based on the combination of information from a range of diverse sources, including physicochemical properties and functional annotations from FireDB and Swissprot and kinase-specific characteristics such as the membership to specific kinase groups, the annotation with disease-associated GO terms or the occurrence of the mutation in PFAM domains, and the relevance of the residues in determining kinase subfamily specificity from S3Det. This predictor yields interesting results that compare favourably with other methods in the field when applied to protein kinases. Together with the predictions, wKinMut offers a number of integrated services for the analysis of mutations. These include: the classification of the kinase, information about associations of the kinase with other proteins extracted from iHop, the mapping of the mutations onto PDB structures, pathogenicity records from a number of databases and the classification of mutations in large-scale cancer studies. Importantly, wKinMut is connected with the SNP2L system that extracts mentions of mutations directly from the literature, and therefore increases the possibilities of finding interesting functional information associated to the studied mutations. 
Conclusions wKinMut facilitates the exploration of the information available about individual mutations by integrating prediction approaches with the automatic extraction of information from the literature (text mining) and several state-of-the-art databases. wKinMut has been used during the last year for the analysis of the consequences of mutations in the context of a number of cancer genome projects, including the recent analysis of Chronic Lymphocytic Leukemia cases and is publicly available at http://wkinmut.bioinfo.cnio.es.
Affiliation(s)
- Jose M G Izarzugaza
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), C/Melchor Fernandez Almagro, 3, E-28029 Madrid, Spain.
|
12
|
Stefl S, Nishi H, Petukh M, Panchenko AR, Alexov E. Molecular mechanisms of disease-causing missense mutations. J Mol Biol 2013; 425:3919-36. [PMID: 23871686 DOI: 10.1016/j.jmb.2013.07.014] [Citation(s) in RCA: 209] [Impact Index Per Article: 17.4] [Received: 05/06/2013] [Revised: 07/04/2013] [Accepted: 07/10/2013] [Indexed: 12/23/2022]
Abstract
Genetic variations resulting in a change of amino acid sequence can have a dramatic effect on stability, hydrogen bond network, conformational dynamics, activity and many other physiologically important properties of proteins. The substitutions of only one residue in a protein sequence, so-called missense mutations, can be related to many pathological conditions and may influence susceptibility to disease and drug treatment. The plausible effects of missense mutations range from affecting the macromolecular stability to perturbing macromolecular interactions and cellular localization. Here we review the individual cases and genome-wide studies that illustrate the association between missense mutations and diseases. In addition, we emphasize that the molecular mechanisms of effects of mutations should be revealed in order to understand the disease origin. Finally, we report the current state-of-the-art methodologies that predict the effects of mutations on protein stability, the hydrogen bond network, pH dependence, conformational dynamics and protein function.
Affiliation(s)
- Shannon Stefl
- Computational Biophysics and Bioinformatics, Department of Physics, Clemson University, Clemson, SC 29634, USA
|
13
|
Residue mutations and their impact on protein structure and function: detecting beneficial and pathogenic changes. Biochem J 2013; 449:581-94. [DOI: 10.1042/bj20121221] [Citation(s) in RCA: 131] [Impact Index Per Article: 10.9] [Indexed: 12/29/2022]
Abstract
The present review focuses on the evolution of proteins and the impact of amino acid mutations on function from a structural perspective. Proteins evolve under the law of natural selection and undergo alternating periods of conservative evolution and of relatively rapid change. The likelihood of mutations being fixed in the genome depends on various factors, such as the fitness of the phenotype or the position of the residues in the three-dimensional structure. For example, co-evolution of residues located close together in three-dimensional space can occur to preserve global stability. Whereas point mutations can fine-tune the protein function, residue insertions and deletions (‘decorations’ at the structural level) can sometimes modify functional sites and protein interactions more dramatically. We discuss recent developments and tools to identify such episodic mutations, and examine their applications in medical research. Such tools have been tested on simulated data and applied to real data such as viruses or animal sequences. Traditionally, there has been little if any cross-talk between the fields of protein biophysics, protein structure–function and molecular evolution. However, the last several years have seen some exciting developments in combining these approaches to obtain an in-depth understanding of how proteins evolve. For example, a better understanding of how structural constraints affect protein evolution will greatly help us to optimize our models of sequence evolution. The present review explores this new synthesis of perspectives.
|
14
|
Hashimoto K, Rogozin IB, Panchenko AR. Oncogenic potential is related to activating effect of cancer single and double somatic mutations in receptor tyrosine kinases. Hum Mutat 2012; 33:1566-75. [PMID: 22753356 PMCID: PMC3465464 DOI: 10.1002/humu.22145] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Received: 01/19/2012] [Accepted: 05/29/2012] [Indexed: 01/16/2023]
Abstract
Aberrant activation of receptor tyrosine kinases (RTKs) is a common feature of many cancer cells. It was previously suggested that the mechanisms of kinase activation in cancer might be linked to transitions between active and inactive states. Here, we estimate the effects of single and double cancer mutations on the stability of active and inactive states of the kinase domains from different RTKs. We show that singleton cancer mutations destabilize active and inactive states; however, inactive states are destabilized more than the active ones, leading to kinase activation. We show that there exists a relationship between the estimate of oncogenic potential of cancer mutation and kinase activation. Namely, more frequent mutations have a higher activating effect, which might allow us to predict the activating effect of the mutations from the mutation spectra. Independent evolutionary analysis of mutation spectra complements this observation and finds the same frequency threshold defining mutation hotspots. We analyze double mutations and report a positive epistasis and additional advantage of doublets with respect to cancer cell fitness. The activation mechanisms of double mutations differ from those of single mutations and double mutation spectrum is found to be dissimilar to the mutation spectrum of singletons.
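The activation argument in this abstract (both states are destabilized, but the inactive state more so) can be written as a small calculation. This is a sketch under stated assumptions: the sign convention (positive values mean destabilization), the function names, and the numbers are all hypothetical and are not taken from the paper.

```python
def activation_shift(ddg_active: float, ddg_inactive: float) -> float:
    """Extra destabilization of the inactive state relative to the active state.

    Assumed convention: a positive ddG means the mutation destabilizes that state.
    A positive return value means the inactive state is hit harder.
    """
    return ddg_inactive - ddg_active


def favors_active_state(ddg_active: float, ddg_inactive: float) -> bool:
    """True when the mutation destabilizes the inactive state more than the active one."""
    return activation_shift(ddg_active, ddg_inactive) > 0


# Hypothetical mutation: destabilizes both states, the inactive one more strongly.
shift = activation_shift(ddg_active=0.5, ddg_inactive=2.0)
is_activating = favors_active_state(ddg_active=0.5, ddg_inactive=2.0)
```

Under this convention a mutation can be destabilizing overall and still shift the balance toward the active state, which is the qualitative point the abstract makes.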
Affiliation(s)
- Igor B. Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Anna R. Panchenko
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
15
|
Simonelli V, Mazzei F, D'Errico M, Dogliotti E. Reprint of: gene susceptibility to oxidative damage: from single nucleotide polymorphisms to function. Mutat Res 2012; 736:104-16. [PMID: 22732424 DOI: 10.1016/j.mrfmmm.2012.06.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2011] [Revised: 09/27/2011] [Accepted: 10/27/2011] [Indexed: 11/18/2022]
Abstract
Oxidative damage to DNA can cause mutations, and mutations can lead to cancer. DNA repair of oxidative damage should therefore play a pivotal role in defending humans against cancer. This is exemplified by the increased risk of colorectal cancer of patients with germ-line mutations of the oxidative damage DNA glycosylase MUTYH. In contrast to germ-line mutations in DNA repair genes, which cause a strong deficiency in DNA repair activity in all cell types, the role of single nucleotide polymorphisms (SNPs) in sporadic cancer is unclear also because deficiencies in DNA repair, if any, are expected to be much milder. Further slowing down progress are the paucity of accurate and reproducible functional assays and poor epidemiological design of many studies. This review will focus on the most common and widely studied SNPs of oxidative DNA damage repair proteins trying to bridge the information available on biochemical and structural features of the repair proteins with the functional effects of these variants and their potential impact on the pathogenesis of disease.
Affiliation(s)
- Valeria Simonelli
- Department of Environment and Primary Prevention, Istituto Superiore di Sanità, Rome, Italy.
|
16
|
Valencia A, Hidalgo M. Getting personalized cancer genome analysis into the clinic: the challenges in bioinformatics. Genome Med 2012; 4:61. [PMID: 22839973 PMCID: PMC3580417 DOI: 10.1186/gm362] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 12/18/2022] Open
Abstract
Progress in genomics has raised expectations in many fields, and particularly in personalized cancer research. The new technologies available make it possible to combine information about potential disease markers, altered function and accessible drug targets, which, coupled with pathological and medical information, will help produce more appropriate clinical decisions. The accessibility of such experimental techniques makes it all the more necessary to improve and adapt computational strategies to the new challenges. This review focuses on the critical issues associated with the standard pipeline, which includes: DNA sequencing analysis; analysis of mutations in coding regions; the study of genome rearrangements; extrapolating information on mutations to the functional and signaling level; and predicting the effects of therapies using mouse tumor models. We describe the possibilities, limitations and future challenges of current bioinformatics strategies for each of these issues. Furthermore, we emphasize the need for the collaboration between the bioinformaticians who implement the software and use the data resources, the computational biologists who develop the analytical methods, and the clinicians, the systems' end users and those ultimately responsible for taking medical decisions. Finally, the different steps in cancer genome analysis are illustrated through examples of applications in cancer genome analysis.
Affiliation(s)
- Alfonso Valencia
- Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernández Almagro, 3, E-28029 Madrid, Spain
- Manuel Hidalgo
- Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernández Almagro, 3, E-28029 Madrid, Spain
|
17
|
Izarzugaza JMG, del Pozo A, Vazquez M, Valencia A. Prioritization of pathogenic mutations in the protein kinase superfamily. BMC Genomics 2012; 13 Suppl 4:S3. [PMID: 22759651 PMCID: PMC3303724 DOI: 10.1186/1471-2164-13-s4-s3] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 01/23/2023] Open
Abstract
BACKGROUND Most of the many mutations described in human protein kinases are tolerated without significant disruption of the corresponding structures or molecular functions, while some of them have been associated with a variety of human diseases, including cancer. In the last decade, a plethora of computational methods to predict the effect of missense single-nucleotide variants (SNVs) have been developed. Still, current high-throughput sequencing efforts and the concomitant need for massive interpretation of protein sequence variants will demand more efficient and/or accurate computational methods in the forthcoming years.
RESULTS We present KinMut, a support vector machine (SVM) approach, to identify pathogenic mutations in the protein kinase superfamily. KinMut relies on a combination of sequence-derived features that describe mutations at different levels: (1) Gene level: membership of a specific group in Kinbase and the annotation with GO terms; (2) Domain level: annotated PFAM domains; and (3) Residue level: physicochemical features of amino acids, specificity determining positions, and functional annotations from SwissProt and FireDB. The system has been trained with the set of 3492 human kinase mutations in UniProt for which experimental validation of their pathogenic or neutral character exists. In addition, we discuss the relative importance of these independent properties and their combination for the development of a kinase-specific predictor. Finally, we compare KinMut with other state-of-the-art prediction methods.
CONCLUSIONS Family-specific features appear among the most discriminative information sources, which allow us to produce accurate results in a reliable and very simple way with minimal supervision. Our study aims to broaden the knowledge on the mechanisms by which mutations in the human kinome contribute to disease with a particular focus on cancer. The classifier as well as further documentation is available at http://kinmut.bioinfo.cnio.es/.
Affiliation(s)
- Jose M G Izarzugaza
- Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain.
|
18
|
Dudley JT, Kim Y, Liu L, Markov GJ, Gerold K, Chen R, Butte AJ, Kumar S. Human genomic disease variants: a neutral evolutionary explanation. Genome Res 2012; 22:1383-94. [PMID: 22665443 PMCID: PMC3409252 DOI: 10.1101/gr.133702.111] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 12/12/2022]
Abstract
Many perspectives on the role of evolution in human health include nonempirical assumptions concerning the adaptive evolutionary origins of human diseases. Evolutionary analyses of the increasing wealth of clinical and population genomic data have begun to challenge these presumptions. In order to systematically evaluate such claims, the time has come to build a common framework for an empirical and intellectual unification of evolution and modern medicine. We review the emerging evidence and provide a supporting conceptual framework that establishes the classical neutral theory of molecular evolution (NTME) as the basis for evaluating disease-associated genomic variations in health and medicine. For over a decade, the NTME has already explained the origins and distribution of variants implicated in diseases and has illuminated the power of evolutionary thinking in genomic medicine. We suggest that a majority of disease variants in modern populations will have neutral evolutionary origins (previously neutral), with a relatively smaller fraction exhibiting adaptive evolutionary origins (previously adaptive). This pattern is expected to hold true for common as well as rare disease variants. Ultimately, a neutral evolutionary perspective will provide medicine with an informative and actionable framework that enables objective clinical assessment beyond convenient tendencies to invoke past adaptive events in human history as a root cause of human disease.
Affiliation(s)
- Joel T Dudley
- Program in Biomedical Informatics, Stanford University School of Medicine, Stanford, California 94305, USA
|
19
|
Simonelli V, Mazzei F, D'Errico M, Dogliotti E. Gene susceptibility to oxidative damage: from single nucleotide polymorphisms to function. Mutat Res 2012; 731:1-13. [PMID: 22155132 DOI: 10.1016/j.mrfmmm.2011.10.012] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 01/25/2011] [Revised: 09/27/2011] [Accepted: 10/27/2011] [Indexed: 05/31/2023]
Abstract
Oxidative damage to DNA can cause mutations, and mutations can lead to cancer. DNA repair of oxidative damage should therefore play a pivotal role in defending humans against cancer. This is exemplified by the increased risk of colorectal cancer of patients with germ-line mutations of the oxidative damage DNA glycosylase MUTYH. In contrast to germ-line mutations in DNA repair genes, which cause a strong deficiency in DNA repair activity in all cell types, the role of single nucleotide polymorphisms (SNPs) in sporadic cancer is unclear also because deficiencies in DNA repair, if any, are expected to be much milder. Further slowing down progress are the paucity of accurate and reproducible functional assays and poor epidemiological design of many studies. This review will focus on the most common and widely studied SNPs of oxidative DNA damage repair proteins trying to bridge the information available on biochemical and structural features of the repair proteins with the functional effects of these variants and their potential impact on the pathogenesis of disease.
Affiliation(s)
- Valeria Simonelli
- Department of Environment and Primary Prevention, Istituto Superiore di Sanità, Rome, Italy.
|
20
|
Levy R, Sobolev V, Edelman M. First- and second-shell metal binding residues in human proteins are disproportionately associated with disease-related SNPs. Hum Mutat 2011; 32:1309-18. [DOI: 10.1002/humu.21573] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 03/17/2011] [Accepted: 07/06/2011] [Indexed: 11/10/2022]
|
21
|
Izarzugaza JMG, Hopcroft LEM, Baresic A, Orengo CA, Martin ACR, Valencia A. Characterization of pathogenic germline mutations in human protein kinases. BMC Bioinformatics 2011; 12 Suppl 4:S1. [PMID: 21992016 PMCID: PMC3194193 DOI: 10.1186/1471-2105-12-s4-s1] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 01/16/2023] Open
Abstract
BACKGROUND Protein Kinases are a superfamily of proteins involved in crucial cellular processes such as cell cycle regulation and signal transduction. Accordingly, they play an important role in cancer biology. To contribute to the study of the relation between kinases and disease we compared pathogenic mutations to neutral mutations as an extension to our previous analysis of cancer somatic mutations. First, we analyzed native and mutant proteins in terms of amino acid composition. Secondly, mutations were characterized according to their potential structural effects and finally, we assessed the location of the different classes of polymorphisms with respect to kinase-relevant positions in terms of subfamily specificity, conservation, accessibility and functional sites. RESULTS Pathogenic Protein Kinase mutations perturb essential aspects of protein function, including disruption of substrate binding and/or effector recognition at family-specific positions. Interestingly, these mutations in Protein Kinases display a tendency to avoid structurally relevant positions, which represents a significant difference with respect to the average distribution of pathogenic mutations in other protein families. CONCLUSIONS Disease-associated mutations display sound differences with respect to neutral mutations: several amino acids are specific to each mutation type, different structural properties characterize each class and the distribution of pathogenic mutations within the consensus structure of the Protein Kinase domain is substantially different to that for non-pathogenic mutations. This preferential distribution confirms previous observations about the functional and structural distribution of the controversial cancer driver and passenger somatic mutations and their use as a proxy for the study of the involvement of somatic mutations in cancer development.
Affiliation(s)
- Jose M G Izarzugaza
- Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), C/Melchor Fernandez Almagro 3, E28029 Madrid, Spain
|
22
|
Abstract
A key goal in cancer research is to find the genomic alterations that underlie malignant cells. Genomics has proved successful in identifying somatic variants at a large scale. However, it has become evident that a typical cancer exhibits a heterogeneous mutation pattern across samples. Cases where the same alteration is observed repeatedly seem to be the exception rather than the norm. Thus, pinpointing the key alterations (driver mutations) from a background of variations with no direct causal link to cancer (passenger mutations) is difficult. Here we analyze somatic missense mutations from cancer samples and their healthy tissue counterparts (germline mutations) from the viewpoint of germline fitness. We calibrate a scoring system from protein domain alignments to score mutations and their target loci. We show first that this score predicts to a good degree the rate of polymorphism of the observed germline variation. The scoring is then applied to somatic mutations. We show that candidate cancer genes prone to copy number loss harbor mutations with germline fitness effects that are significantly more deleterious than expected by chance. This suggests that missense mutations play a driving role in tumor suppressor genes. Furthermore, these mutations fall preferably onto loci in sequence neighborhoods that are high scoring in terms of germline fitness. In contrast, for somatic mutations in candidate oncogenes we do not observe a statistically significant effect. These results help to inform how to exploit germline fitness predictions in discovering new genes and mutations responsible for cancer.
|
23
|
Robison K. Application of second-generation sequencing to cancer genomics. Brief Bioinform 2010; 11:524-34. [DOI: 10.1093/bib/bbq013] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Indexed: 12/13/2022] Open
|
24
|
Dixit A, Yi L, Gowthaman R, Torkamani A, Schork NJ, Verkhivker GM. Sequence and structure signatures of cancer mutation hotspots in protein kinases. PLoS One 2009; 4:e7485. [PMID: 19834613 PMCID: PMC2759519 DOI: 10.1371/journal.pone.0007485] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.8] [Received: 07/16/2009] [Accepted: 09/25/2009] [Indexed: 11/18/2022] Open
Abstract
Protein kinases are the most common protein domains implicated in cancer, where somatically acquired mutations are known to be functionally linked to a variety of cancers. Resequencing studies of protein kinase coding regions have emphasized the importance of sequence and structure determinants of cancer-causing kinase mutations in understanding the mutation-dependent activation process. We have developed an integrated bioinformatics resource, which consolidated and mapped all currently available information on genetic modifications in protein kinase genes with sequence, structure and functional data. The integration of diverse data types provided a convenient framework for kinome-wide study of sequence-based and structure-based signatures of cancer mutations. The database-driven analysis has revealed a differential enrichment of SNP categories in functional regions of the kinase domain, demonstrating that a significant number of cancer mutations could fall at structurally equivalent positions (mutational hotspots) within the catalytic core. We have also found that structurally conserved mutational hotspots can be shared by multiple kinase genes and are often enriched by cancer driver mutations with high oncogenic activity. Structural modeling and energetic analysis of the mutational hotspots have suggested a common molecular mechanism of kinase activation by cancer mutations, and have allowed us to reconcile the experimental data. According to a proposed mechanism, structural effect of kinase mutations with a high oncogenic potential may manifest in a significant destabilization of the autoinhibited kinase form, which is likely to drive tumorigenesis at some level. Structure-based functional annotation and prediction of cancer mutation effects in protein kinases can facilitate an understanding of the mutation-dependent activation process and inform experimental studies exploring molecular pathology of tumorigenesis.
Affiliation(s)
- Anshuman Dixit
- Graduate Program for Bioinformatics, Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, United States of America
- Department of Pharmaceutical Chemistry, School of Pharmacy, The University of Kansas, Lawrence, Kansas, United States of America
- Lin Yi
- Graduate Program for Bioinformatics, Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, United States of America
- Ragul Gowthaman
- Graduate Program for Bioinformatics, Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, United States of America
- Ali Torkamani
- Scripps Genomic Medicine, Department of Molecular and Experimental Medicine, Scripps Health and The Scripps Research Institute, La Jolla, California, United States of America
- Nicholas J. Schork
- Scripps Genomic Medicine, Department of Molecular and Experimental Medicine, Scripps Health and The Scripps Research Institute, La Jolla, California, United States of America
- Gennady M. Verkhivker
- Graduate Program for Bioinformatics, Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, United States of America
- Department of Pharmaceutical Chemistry, School of Pharmacy, The University of Kansas, Lawrence, Kansas, United States of America
- Department of Pharmacology, University of California San Diego, La Jolla, California, United States of America
|
25
|
Izarzugaza JMG, Baresic A, McMillan LEM, Yeats C, Clegg AB, Orengo CA, Martin ACR, Valencia A. An integrated approach to the interpretation of single amino acid polymorphisms within the framework of CATH and Gene3D. BMC Bioinformatics 2009; 10 Suppl 8:S5. [PMID: 19758469 PMCID: PMC2745587 DOI: 10.1186/1471-2105-10-s8-s5] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND The phenotypic effects of sequence variations in protein-coding regions come about primarily via their effects on the resulting structures, for example by disrupting active sites or affecting structural stability. In order better to understand the mechanisms behind known mutant phenotypes, and predict the effects of novel variations, biologists need tools to gauge the impacts of DNA mutations in terms of their structural manifestation. Although many mutations occur within domains whose structure has been solved, many more occur within genes whose protein products have not been structurally characterized. RESULTS Here we present 3DSim (3D Structural Implication of Mutations), a database and web application facilitating the localization and visualization of single amino acid polymorphisms (SAAPs) mapped to protein structures even where the structure of the protein of interest is unknown. The server displays information on 6514 point mutations, 4865 of them known to be associated with disease. These polymorphisms are drawn from SAAPdb, which aggregates data from various sources including dbSNP and several pathogenic mutation databases. While the SAAPdb interface displays mutations on known structures, 3DSim projects mutations onto known sequence domains in Gene3D. This resource contains sequences annotated with domains predicted to belong to structural families in the CATH database. Mappings between domain sequences in Gene3D and known structures in CATH are obtained using a MUSCLE alignment. 1210 three-dimensional structures corresponding to CATH structural domains are currently included in 3DSim; these domains are distributed across 396 CATH superfamilies, and provide a comprehensive overview of the distribution of mutations in structural space. CONCLUSION The server is publicly available at http://3DSim.bioinfo.cnio.es/. 
In addition, the database containing the mapping between SAAPdb, Gene3D and CATH is available on request and most of the functionality is available through programmatic web service access.
Affiliation(s)
- Jose M G Izarzugaza
- Institute of Structural and Molecular Biology, University College London, UK.
|