1
Piergentili R, Sechi S. Targeting Regulatory Noncoding RNAs in Human Cancer: The State of the Art in Clinical Trials. Pharmaceutics 2025; 17:471. [PMID: 40284466 PMCID: PMC12030637 DOI: 10.3390/pharmaceutics17040471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2025] [Revised: 03/29/2025] [Accepted: 03/31/2025] [Indexed: 04/29/2025] Open
Abstract
Noncoding RNAs (ncRNAs) are a heterogeneous group of RNA molecules whose classification is mainly based on arbitrary criteria such as the molecule length, secondary structures, and cellular functions. A large fraction of these ncRNAs play a regulatory role regarding messenger RNAs (mRNAs) or other ncRNAs, creating an intracellular network of cross-interactions that allow the fine and complex regulation of gene expression. Altering the balance between these interactions may be sufficient to cause a transition from health to disease and vice versa. This leads to the possibility of intervening in these mechanisms to re-establish health in patients. The regulatory role of ncRNAs is associated with all cancer hallmarks, such as proliferation, apoptosis, invasion, metastasis, and genomic instability. Based on the function performed in carcinogenesis, ncRNAs may behave either as oncogenes or tumor suppressors. However, this distinction is not rigid; some ncRNAs can fall into both classes depending on the tissue considered or the target molecule. Furthermore, some of them are also involved in regulating the response to traditional cancer-therapeutic approaches. In general, the regulation of molecular mechanisms by ncRNAs is very complex and still largely unclear, but it has enormous potential both for the development of new therapies, especially in cases where traditional methods fail, and for their use as novel and more efficient biomarkers. Overall, this review will provide a brief overview of ncRNAs in human cancer biology, with a specific focus on describing the most recent ongoing clinical trials (CT) in which ncRNAs have been tested for their potential as therapeutic agents or evaluated as biomarkers.
2
Zeng G, Zhao C, Li G, Huang Z, Zhuang J, Liang X, Yu X, Fang S. Identifying somatic driver mutations in cancer with a language model of the human genome. Comput Struct Biotechnol J 2025; 27:531-540. [PMID: 39968174 PMCID: PMC11833646 DOI: 10.1016/j.csbj.2025.01.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2024] [Revised: 01/12/2025] [Accepted: 01/14/2025] [Indexed: 02/20/2025] Open
Abstract
Somatic driver mutations play important roles in cancer and must be precisely identified to advance our understanding of tumorigenesis and its promotion and progression. However, identifying somatic driver mutations remains challenging in Homo sapiens genomics due to the random nature of mutations and the high cost of qualitative experiments. Building on the powerful sequence interpretation capabilities of language models, we propose a self-attention-based contextualized pretrained language model for somatic driver mutation identification. We pretrained the model with the Homo sapiens reference genome to equip it with the ability to understand genome sequences and then fine-tuned it for oncogene and tumor suppressor gene prediction tasks, enabling it to extract features related to driver genes from the original genome sequence. The fine-tuned model was used to obtain the mutations' carcinogenic effect characteristics to further identify whether the mutation is a driver or a passenger. Compared with other computational algorithms, our method achieved excellent somatic driver mutation identification performance on the test set, with an absolute improvement of 4.31% in AUROC over the best comparison method. The strong performance of our method indicates that it can provide new insights into the discovery of cancer drivers.
Affiliation(s)
- Guangjian Zeng
  - School of Biomedical Engineering, Shenzhen University, Shenzhen, China
  - School of Public Health and Emergency Management, Southern University of Science and Technology, Shenzhen, China
- Chengzhi Zhao
  - School of Public Health and Emergency Management, Southern University of Science and Technology, Shenzhen, China
- Guanpeng Li
  - School of Public Health and Emergency Management, Southern University of Science and Technology, Shenzhen, China
- Zhengyang Huang
  - School of Biomedical Engineering, Shenzhen University, Shenzhen, China
- Jinhu Zhuang
  - Shenzhen Health Development Research and Data Management Center, Guangdong, China
- Xiaohua Liang
  - Department of Clinical Epidemiology and Biostatistics, Children's Hospital of Chongqing Medical University, National Clinical Research Center for Child Health and Disorders, Ministry of Education Key Laboratory of Child Development and Disorders, Chongqing Key Laboratory of Pediatrics, Chongqing, China
- Xiaxia Yu
  - School of Biomedical Engineering, Shenzhen University, Shenzhen, China
- Shenying Fang
  - School of Public Health and Emergency Management, Southern University of Science and Technology, Shenzhen, China
3
Ouyang J. Transcription as a double-edged sword in genome maintenance. FEBS Lett 2025; 599:147-156. [PMID: 39704019 DOI: 10.1002/1873-3468.15080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Revised: 10/29/2024] [Accepted: 10/31/2024] [Indexed: 12/21/2024]
Abstract
Genome maintenance is essential for the integrity of the genetic blueprint, of which only a small fraction is transcribed in higher eukaryotes. DNA lesions occurring in the transcribed genome trigger transcription pausing and transcription-coupled DNA repair. There are two major transcription-coupled DNA repair pathways. The transcription-coupled nucleotide excision repair (TC-NER) pathway has been well studied for decades, while the transcription-coupled homologous recombination repair (TC-HR) pathway has recently gained attention. Importantly, recent studies have uncovered crucial roles of RNA transcripts in TC-HR, opening exciting directions for future research. Transcription also plays pivotal roles in regulating the stability of highly specialized genomic structures such as telomeres, centromeres, and fragile sites. Despite their positive function in genome maintenance, transcription and RNA transcripts can also be the sources of genomic instability, especially when colliding with DNA replication and forming unscheduled pathological RNA:DNA hybrids (R-loops), respectively. Pathological R-loops can result from transcriptional stress, which may be induced by transcription dysregulation. Future investigation into the interplay between transcription and DNA repair will reveal novel molecular bases for genome maintenance and transcriptional stress-associated genomic instability, providing therapeutic targets for human disease intervention.
Affiliation(s)
- Jian Ouyang
  - Department of Biochemistry and Molecular Biology
  - Hollings Cancer Center, Medical University of South Carolina, Charleston, SC, USA
4
Wang J, Yang M, Ali O, Dragland JS, Bjørås M, Farkas L. Predicting regulatory mutations and their target genes by new computational integrative analysis: A study of follicular lymphoma. Comput Biol Med 2024; 178:108787. [PMID: 38901187 DOI: 10.1016/j.compbiomed.2024.108787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 06/12/2024] [Accepted: 06/16/2024] [Indexed: 06/22/2024]
Abstract
Mutations in DNA regulatory regions are increasingly being recognized as important drivers of cancer and other complex diseases. These mutations can regulate gene expression by affecting DNA-protein binding and epigenetic profiles, such as DNA methylation in genome regulatory elements. However, identifying mutation hotspots associated with expression regulation and disease progression in non-coding DNA remains a challenge. Unlike most existing approaches that assign a mutation score to individual single nucleotide polymorphisms (SNPs), a mutation block (MB)-based approach was introduced in this study to assess the collective impact of a cluster of SNPs on transcription factor-DNA binding affinity, differential gene expression (DEG), and nearby DNA methylation. Moreover, the long-distance target genes of functional MBs were identified using a new permutation-based algorithm that assessed the significance of correlations between DNA methylation at regulatory regions and target gene expression. Two new Python packages were developed. The first, a differential methylation region analysis tool (DMR-analysis), was used to detect DMRs and map them to regulatory elements. The second, an integrated DMR, DEG, and SNP analysis tool (DDS-analysis), was used to combine the omics data to identify functional MBs and long-distance target genes. Both tools were validated in follicular lymphoma (FL) cohorts, where not only were known functional MBs and their target genes (BCL2 and BCL6) recovered, but novel genes were also found, including CDCA4 and JAG2, which may be associated with FL development. These genes are linked to target gene expression and are significantly correlated with the methylation of nearby DNA sequences in FL. The proposed computational integrative analysis of multiomics data holds promise for identifying regulatory mutations in cancer and other complex diseases.
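The permutation idea named in this abstract can be illustrated with a generic test. The Python sketch below is not the DDS-analysis implementation: it is a plain two-sided permutation test for the correlation between methylation at one regulatory region and expression of one candidate target gene. The function names `pearson` and `permutation_pvalue`, the choice of Pearson correlation, and the toy values are all assumptions made for illustration.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_pvalue(methylation, expression, n_perm=2000, seed=0):
    """Return (observed r, two-sided permutation p-value).

    Expression values are shuffled across samples to break any real
    pairing; the p-value is the share of shuffles whose |r| is at least
    as large as the observed |r| (with a +1 correction so p is never 0).
    """
    rng = random.Random(seed)
    observed = pearson(methylation, expression)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(methylation, shuffled)) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: methylation fraction and expression level per sample.
methylation = [0.10, 0.22, 0.35, 0.41, 0.58, 0.63, 0.79, 0.90]
expression = [9.1, 8.7, 7.9, 7.2, 6.1, 5.8, 4.3, 3.9]
r, p = permutation_pvalue(methylation, expression)
print(f"r = {r:.3f}, permutation p = {p:.4f}")
```

A permutation null avoids distributional assumptions about methylation or expression values, which is one common reason to prefer it over a parametric correlation test in small cohorts.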
Affiliation(s)
- Junbai Wang
  - Department of Clinical Molecular Biology (EpiGen), Akershus University Hospital and University of Oslo, Lørenskog, Norway
  - Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Campus AHUS/Oslo, Norway.
- Mingyi Yang
  - Department of Microbiology, Oslo University Hospital, Oslo, Norway
  - Department of Medical Biochemistry, Oslo University Hospital, Oslo, Norway
  - Centre for Embryology and Healthy Development (CRESCO), University of Oslo, Oslo, 0373, Norway
- Omer Ali
  - Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Campus AHUS/Oslo, Norway
  - Department of Pathology, Oslo University Hospital - Norwegian Radium Hospital, Oslo, Norway
- Jenny Sofie Dragland
  - Department of Pathology, Oslo University Hospital - Norwegian Radium Hospital, Oslo, Norway
- Magnar Bjørås
  - Department of Microbiology, Oslo University Hospital, Oslo, Norway
  - Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
  - Centre for Embryology and Healthy Development (CRESCO), University of Oslo, Oslo, 0373, Norway
- Lorant Farkas
  - Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Campus AHUS/Oslo, Norway
  - Department of Pathology, Oslo University Hospital - Norwegian Radium Hospital, Oslo, Norway
5
Wang M, Li SC, Shen B. Elevated incidence of somatic mutations at prevalent genetic sites. Brief Bioinform 2024; 25:bbae065. [PMID: 38426321 PMCID: PMC10939422 DOI: 10.1093/bib/bbae065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 01/03/2024] [Accepted: 02/03/2024] [Indexed: 03/02/2024] Open
Abstract
Common loci are a distinct set of human genome sites that harbor genetic variants found in at least 1% of the population. Small somatic mutations at common loci and at non-common loci (csmVariants and ncsmVariants, respectively) are presumed to occur with similar probabilities. However, our work revealed that within the coding region, common loci constituted only 1.03% of all loci, yet they accounted for 5.14% of TCGA somatic mutations. Furthermore, the small somatic mutation incidence rate at these common loci was 2.7 times that observed at non-common loci. Notably, the csmVariants exhibited a recurrence rate of 36.14%, which was 2.59 times that of the ncsmVariants. The C-to-T transition at CpG sites accounted for 32.41% of the csmVariants, 2.93 times the proportion among ncsmVariants. Interestingly, the aging-related mutational signature contributed to 13.87% of the csmVariants, 5.5 times that of ncsmVariants. Moreover, 35.93% of the csmVariant contexts exhibited palindromic features, 1.84 times the proportion among ncsmVariant contexts. Notably, cancer patients with higher csmVariant rates had better progression-free survival. Furthermore, cancer patients with high-frequency csmVariants enriched with mismatch repair deficiency were also associated with better progression-free survival. The accumulation of csmVariants during carcinogenesis is a complex process influenced by various factors. These include the presence of a substantial percentage of palindromic sequences at csmVariant sites, the impact of aging, and DNA mismatch repair deficiency. Together, these factors contribute to the higher somatic mutation incidence rates of common loci and the overall accumulation of csmVariants in cancer development.
Affiliation(s)
- Mengyao Wang
  - Department of Computer Science, City University of Hong Kong, 83 Tat Chee Ave, Kowloon Tong, Hong Kong, China
  - Laboratory of Gastrointestinal Cancer (Fujian Medical University), Ministry of Education, Fuzhou, China
- Shuai Cheng Li
  - Department of Computer Science, City University of Hong Kong, 83 Tat Chee Ave, Kowloon Tong, Hong Kong, China
- Bairong Shen
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, Sichuan University, Chengdu, Sichuan, China
  - Joint Laboratory of Artificial Intelligence for Critical Care Medicine, Department of Critical Care Medicine and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, 610212, Chengdu, China
6
Carrasco Pro S, Hook H, Bray D, Berenzy D, Moyer D, Yin M, Labadorf AT, Tewhey R, Siggers T, Fuxman Bass JI. Widespread perturbation of ETS factor binding sites in cancer. Nat Commun 2023; 14:913. [PMID: 36808133 PMCID: PMC9938127 DOI: 10.1038/s41467-023-36535-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/22/2022] [Accepted: 02/03/2023] [Indexed: 02/19/2023] Open
Abstract
Although >90% of somatic mutations reside in non-coding regions, few have been reported as cancer drivers. To predict driver non-coding variants (NCVs), we present a transcription factor (TF)-aware burden test based on a model of coherent TF function in promoters. We apply this test to NCVs from the Pan-Cancer Analysis of Whole Genomes cohort and predict 2555 driver NCVs in the promoters of 813 genes across 20 cancer types. These genes are enriched in cancer-related gene ontologies, essential genes, and genes associated with cancer prognosis. We find that 765 candidate driver NCVs alter transcriptional activity, 510 lead to differential binding of TF-cofactor regulatory complexes, and that they primarily impact the binding of ETS factors. Finally, we show that different NCVs within a promoter often affect transcriptional activity through shared mechanisms. Our integrated computational and experimental approach shows that cancer NCVs are widespread and that ETS factors are commonly disrupted.
Affiliation(s)
- Heather Hook
  - Department of Biology, Boston University, Boston, MA, USA
- David Bray
  - Bioinformatics Program, Boston University, Boston, MA, USA
- Devlin Moyer
  - Bioinformatics Program, Boston University, Boston, MA, USA
- Meimei Yin
  - Department of Biology, Boston University, Boston, MA, USA
- Adam Thomas Labadorf
  - Bioinformatics Hub, Boston University, Boston, MA, USA
  - Boston University School of Medicine, Department of Neurology, Boston, MA, USA
- Trevor Siggers
  - Bioinformatics Program, Boston University, Boston, MA, USA.
  - Department of Biology, Boston University, Boston, MA, USA.
  - Biological Design Center, Boston University, Boston, MA, USA.
- Juan Ignacio Fuxman Bass
  - Bioinformatics Program, Boston University, Boston, MA, USA.
  - Department of Biology, Boston University, Boston, MA, USA.
7
Pudjihartono M, Perry JK, Print C, O'Sullivan JM, Schierding W. Interpretation of the role of germline and somatic non-coding mutations in cancer: expression and chromatin conformation informed analysis. Clin Epigenetics 2022; 14:120. [PMID: 36171609 PMCID: PMC9520844 DOI: 10.1186/s13148-022-01342-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 09/21/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND There has been extensive scrutiny of cancer driving mutations within the exome (especially amino acid altering mutations) as these are more likely to have a clear impact on protein functions, and thus on cell biology. However, this has come at the expense of the systematic identification of regulatory (non-coding) variants, which have recently been identified as putative somatic drivers and key germline risk factors for cancer development. Comprehensive understanding of non-coding mutations requires understanding their role in the disruption of regulatory elements, which then disrupt key biological functions such as gene expression. MAIN BODY We describe how advancements in sequencing technologies have led to the identification of a large number of non-coding mutations with uncharacterized biological significance. We summarize the strategies that have been developed to interpret and prioritize the biological mechanisms impacted by non-coding mutations, focusing on recent annotation of cancer non-coding variants utilizing chromatin states, eQTLs, and chromatin conformation data. CONCLUSION We believe that a better understanding of how to apply different regulatory data types to the study of non-coding mutations will enhance the discovery of novel mechanisms driving cancer.
Affiliation(s)
- Jo K Perry
  - Liggins Institute, The University of Auckland, Auckland, New Zealand
  - The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
- Cris Print
  - The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
  - Department of Molecular Medicine and Pathology, School of Medical Sciences, University of Auckland, Auckland, 1142, New Zealand
- Justin M O'Sullivan
  - Liggins Institute, The University of Auckland, Auckland, New Zealand
  - The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand
  - Australian Parkinson's Mission, Garvan Institute of Medical Research, Sydney, NSW, Australia
  - MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton, UK
- William Schierding
  - Liggins Institute, The University of Auckland, Auckland, New Zealand.
  - The Maurice Wilkins Centre, The University of Auckland, Auckland, New Zealand.
8
Lirussi L, Ayyildiz D, Liu Y, Montaldo NP, Carracedo S, Aure MR, Jobert L, Tekpli X, Touma J, Sauer T, Dalla E, Kristensen VN, Geisler J, Piazza S, Tell G, Nilsen H. A regulatory network comprising let-7 miRNA and SMUG1 is associated with good prognosis in ER+ breast tumours. Nucleic Acids Res 2022; 50:10449-10468. [PMID: 36156150 PMCID: PMC9561369 DOI: 10.1093/nar/gkac807] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/17/2022] [Revised: 08/31/2022] [Accepted: 09/09/2022] [Indexed: 11/13/2022] Open
Abstract
Single-strand selective uracil-DNA glycosylase 1 (SMUG1) initiates base excision repair (BER) of uracil and oxidized pyrimidines. SMUG1 status has been associated with cancer risk and therapeutic response in breast carcinomas and other cancer types. However, SMUG1 is a multifunctional protein involved not only in BER but also in RNA quality control, and its function in cancer cells is unclear. Here we identify several novel SMUG1 interaction partners that function in many biological processes relevant for cancer development and treatment response. Based on this, we hypothesized that the dominating function of SMUG1 in cancer might be ascribed to functions other than BER. We define a bad prognosis signature for SMUG1 by mapping out the SMUG1 interaction network and find that high expression of genes in the bad prognosis network correlated with lower survival probability in ER+ breast cancer. Interestingly, we identified hsa-let-7b-5p microRNA as an upstream regulator of the SMUG1 interactome. Expression of SMUG1 and hsa-let-7b-5p were negatively correlated in breast cancer and we found an inhibitory auto-regulatory loop between SMUG1 and hsa-let-7b-5p in MCF7 breast cancer cells. We conclude that SMUG1 functions in a gene regulatory network that influences survival and treatment response in several cancers.
Affiliation(s)
- Lisa Lirussi
  - Institute of Clinical Medicine, Department of Clinical Molecular Biology, University of Oslo, N-0318 Oslo, Norway
  - Section of Clinical Molecular Biology, Akershus University Hospital (AHUS), 1478 Lørenskog, Norway
- Dilara Ayyildiz
  - Laboratory of Molecular Biology and DNA repair, Department of Medicine, University of Udine, p.le M. Kolbe 4, 33100 Udine, Italy
- Yan Liu
  - Section of Clinical Molecular Biology, Akershus University Hospital (AHUS), 1478 Lørenskog, Norway
- Nicola P Montaldo
  - Institute of Clinical Medicine, Department of Clinical Molecular Biology, University of Oslo, N-0318 Oslo, Norway
- Sergio Carracedo
  - Institute of Clinical Medicine, Department of Clinical Molecular Biology, University of Oslo, N-0318 Oslo, Norway
  - Section of Clinical Molecular Biology, Akershus University Hospital (AHUS), 1478 Lørenskog, Norway
- Miriam R Aure
  - Department of Medical Genetics, Institute of Clinical Medicine, Faculty of Medicine, University of Oslo and Oslo University Hospital, 0450 Oslo, Norway
- Laure Jobert
  - Institute of Clinical Medicine, Department of Clinical Molecular Biology, University of Oslo, N-0318 Oslo, Norway
- Xavier Tekpli
  - Department of Medical Genetics, Institute of Clinical Medicine, Faculty of Medicine, University of Oslo and Oslo University Hospital, 0450 Oslo, Norway
- Joel Touma
  - Department of Breast and Endocrine Surgery, Akershus University Hospital (AHUS), 1478 Lørenskog, Norway
  - Institute of Clinical Medicine, University of Oslo, Campus AHUS, 1478 Lørenskog, Norway
- Torill Sauer
  - Institute of Clinical Medicine, University of Oslo, Campus AHUS, 1478 Lørenskog, Norway
  - Department of Pathology, Akershus University Hospital, 1478 Lørenskog, Norway
- Emiliano Dalla
  - Laboratory of Molecular Biology and DNA repair, Department of Medicine, University of Udine, p.le M. Kolbe 4, 33100 Udine, Italy
- Vessela N Kristensen
  - Department of Medical Genetics, Institute of Clinical Medicine, Faculty of Medicine, University of Oslo and Oslo University Hospital, 0450 Oslo, Norway
  - Department of Pathology, Akershus University Hospital, 1478 Lørenskog, Norway
- Jürgen Geisler
  - Institute of Clinical Medicine, University of Oslo, Campus AHUS, 1478 Lørenskog, Norway
  - Department of Oncology, Akershus University Hospital (AHUS), 1478 Lørenskog, Norway
- Silvano Piazza
  - Bioinformatics Core Facility, Centre for Integrative Biology (CIBIO), University of Trento, via Sommarive 18, 38123, Povo (Trento), Italy
- Gianluca Tell
  - Laboratory of Molecular Biology and DNA repair, Department of Medicine, University of Udine, p.le M. Kolbe 4, 33100 Udine, Italy
- Hilde Nilsen
  - Institute of Clinical Medicine, Department of Clinical Molecular Biology, University of Oslo, N-0318 Oslo, Norway
  - Section of Clinical Molecular Biology, Akershus University Hospital (AHUS), 1478 Lørenskog, Norway
  - Department of Microbiology, Oslo University Hospital, N-0424 Oslo, Norway
9
Mas-Ponte D, McCullough M, Supek F. Spectrum of DNA mismatch repair failures viewed through the lens of cancer genomics and implications for therapy. Clin Sci (Lond) 2022; 136:383-404. [PMID: 35274136 PMCID: PMC8919091 DOI: 10.1042/cs20210682] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/01/2021] [Revised: 02/02/2022] [Accepted: 02/28/2022] [Indexed: 12/15/2022]
Abstract
Genome sequencing can be used to detect DNA repair failures in tumors and learn about underlying mechanisms. Here, we synthesize findings from genomic studies that examined deficiencies of the DNA mismatch repair (MMR) pathway. The impairment of MMR results in genome-wide hypermutation and in the 'microsatellite instability' (MSI) phenotype, i.e. the occurrence of indel mutations at short tandem repeat (microsatellite) loci. The MSI status of tumors was traditionally assessed by molecular testing of a selected set of MS loci or by measuring MMR protein expression levels. Today, genomic data can provide a more complete picture of the consequences on genomic instability. Multiple computational studies examined somatic mutation distributions that result from failed DNA repair pathways in tumors. These include analyzing the commonly studied trinucleotide mutational spectra of single-nucleotide variants (SNVs), as well as of other features such as indels, structural variants, mutation clusters and regional mutation rate redistribution. The identified mutation patterns can be used to rigorously measure prevalence of MMR failures across cancer types, and potentially to subcategorize the MMR deficiencies. Diverse data sources, genomic and pre-genomic, from human and from experimental models, suggest there are different ways in which MMR can fail, and/or that the cell-type or genetic background may result in different types of MMR mutational patterns. The spectrum of MMR failures may direct cancer evolution, generating particular sets of driver mutations. Moreover, MMR affects outcomes of therapy by DNA damaging drugs, antimetabolites, nonsense-mediated mRNA decay (NMD) inhibitors, and immunotherapy by promoting either resistance or sensitivity, depending on the type of therapy.
Affiliation(s)
- David Mas-Ponte
  - Genome Data Science, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute for Science and Technology, Baldiri Reixac 10, Barcelona 08028, Spain
- Marcel McCullough
  - Genome Data Science, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute for Science and Technology, Baldiri Reixac 10, Barcelona 08028, Spain
- Fran Supek
  - Genome Data Science, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute for Science and Technology, Baldiri Reixac 10, Barcelona 08028, Spain
  - Catalan Institution for Research and Advanced Studies (ICREA), Pg Lluís Companys, 23, Barcelona 08010, Spain
10
Dressler L, Bortolomeazzi M, Keddar MR, Misetic H, Sartini G, Acha-Sagredo A, Montorsi L, Wijewardhane N, Repana D, Nulsen J, Goldman J, Pollitt M, Davis P, Strange A, Ambrose K, Ciccarelli FD. Comparative assessment of genes driving cancer and somatic evolution in non-cancer tissues: an update of the Network of Cancer Genes (NCG) resource. Genome Biol 2022; 23:35. [PMID: 35078504 PMCID: PMC8790917 DOI: 10.1186/s13059-022-02607-z] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Received: 08/30/2021] [Accepted: 01/10/2022] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Genetic alterations of somatic cells can drive non-malignant clone formation and promote cancer initiation. However, the link between these processes remains unclear and hampers our understanding of tissue homeostasis and cancer development. RESULTS Here, we collect a literature-based repertoire of 3355 well-known or predicted drivers of cancer and non-cancer somatic evolution in 122 cancer types and 12 non-cancer tissues. Mapping the alterations of these genes in 7953 pan-cancer samples reveals that, despite the large size, the known compendium of drivers is still incomplete and biased towards frequently occurring coding mutations. High overlap exists between drivers of cancer and non-cancer somatic evolution, although significant differences emerge in their recurrence. We confirm and expand the unique properties of drivers and identify a core of evolutionarily conserved and essential genes whose germline variation is strongly counter-selected. Somatic alteration in even one of these genes is sufficient to drive clonal expansion but not malignant transformation. CONCLUSIONS Our study offers a comprehensive overview of our current understanding of the genetic events initiating clone expansion and cancer, revealing significant gaps and biases that still need to be addressed. The compendium of cancer and non-cancer somatic drivers, their literature support, and properties are accessible in the Network of Cancer Genes and Healthy Drivers resource at http://www.network-cancer-genes.org/.
11
Wong JKL, Aichmüller C, Schulze M, Hlevnjak M, Elgaafary S, Lichter P, Zapatka M. Association of mutation signature effectuating processes with mutation hotspots in driver genes and non-coding regions. Nat Commun 2022; 13:178. [PMID: 35013316 PMCID: PMC8748499 DOI: 10.1038/s41467-021-27792-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/23/2020] [Accepted: 12/09/2021] [Indexed: 02/06/2023] Open
Abstract
Cancer driving mutations are difficult to identify, especially in the non-coding part of the genome. Here, we present sigDriver, an algorithm dedicated to calling driver mutations. Using 3813 whole-genome sequenced tumors from the International Cancer Genome Consortium, The Cancer Genome Atlas Program, and a childhood pan-cancer cohort, we employ mutational signatures based on single-base substitution in the context of tri- and penta-nucleotide motifs for hotspot discovery. Knowledge-based annotations on mutational hotspots reveal enrichment in coding regions and regulatory elements for 6 mutational signatures, including APOBEC and somatic hypermutation signatures. APOBEC activity is associated with 32 hotspots, of which 11 are known and 11 are putative regulatory drivers. Somatic single nucleotide variant clusters detected at hypermutation-associated hotspots are distinct from translocations or gene amplifications. Patients carrying APOBEC-induced PIK3CA driver mutations show lower occurrence of signature SBS39. In summary, sigDriver uncovers mutational processes associated with known and putative tumor drivers and hotspots, particularly in the non-coding regions of the genome.
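The tri-nucleotide motif context mentioned in this abstract is commonly encoded as 96 pyrimidine-centred substitution channels. The Python sketch below shows that widely used convention only; it is not sigDriver's code, and `revcomp`, `sbs_channel`, and `count_channels` are hypothetical helper names chosen for illustration.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Reverse complement of a DNA string (upper-case A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sbs_channel(context, alt):
    """Label one substitution, e.g. ("ACG", "T") -> "A[C>T]G".

    `context` is the 3-base reference sequence centred on the mutated
    base and `alt` is the alternate base. When the reference base is a
    purine, the mutation is re-expressed on the opposite strand so that
    every label has C or T in the centre.
    """
    if len(context) != 3 or alt == context[1]:
        raise ValueError("need a 3-base context and a real substitution")
    if context[1] in "AG":
        context = revcomp(context)
        alt = COMPLEMENT[alt]
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


def count_channels(mutations):
    """Tally channel labels for an iterable of (context, alt) pairs."""
    return Counter(sbs_channel(context, alt) for context, alt in mutations)


# Hypothetical toy input: the first two records describe the same event
# seen from opposite strands and therefore share one channel.
counts = count_channels([("ACG", "T"), ("CGT", "A"), ("TTA", "C")])
print(counts)
```

Collapsing both strands onto the pyrimidine keeps the catalogue at 96 channels (6 substitution types times 16 flanking pairs) instead of 192; penta-nucleotide contexts extend the same idea with two flanking bases on each side.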
Affiliation(s)
- John K L Wong
- Division of Molecular Genetics and German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany.
| | - Christian Aichmüller
- Division of Molecular Genetics and German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Markus Schulze
- Computational Oncology Group, Molecular Precision Oncology Program, National Center for Tumor Diseases (NCT) and DKFZ, Heidelberg, Germany
| | - Mario Hlevnjak
- Computational Oncology Group, Molecular Precision Oncology Program, National Center for Tumor Diseases (NCT) and DKFZ, Heidelberg, Germany
| | - Shaymaa Elgaafary
- Gynecologic Oncology, National Center for Tumor Diseases (NCT) and University of Heidelberg, Heidelberg, Germany
- Molecular Precision Oncology Program at the National Center for Tumor Diseases (NCT) and DKFZ, Heidelberg, Germany
| | - Peter Lichter
- Division of Molecular Genetics and German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Molecular Precision Oncology Program at the National Center for Tumor Diseases (NCT) and DKFZ, Heidelberg, Germany
| | - Marc Zapatka
- Division of Molecular Genetics and German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany.
| |
|
12
|
Liu B, Liu Y, Zou J, Zou M, Cheng Z. Smoking is Associated with Lung Adenocarcinoma and Lung Squamous Cell Carcinoma Progression through Inducing Distinguishing lncRNA Alterations in Different Genders. Anticancer Agents Med Chem 2021; 22:1541-1550. [PMID: 34315392 DOI: 10.2174/1871520621666210727115147] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/01/2020] [Revised: 06/08/2021] [Accepted: 06/21/2021] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Smoking contributes to the pathogenesis of lung cancer. Long non-coding RNAs (lncRNAs) play specific roles during the development of lung cancers. OBJECTIVE: To investigate the effects of smoking on lncRNA alterations in lung cancer. METHODS: The study included 522 lung adenocarcinoma (LUAD) and 504 lung squamous cell carcinoma (LUSC) participants. Clinical and lncRNA genetic data were downloaded from The Cancer Genome Atlas (TCGA) database. LncRNA alterations were analyzed in lung cancer patients. Smoking category and packs were evaluated. Correlations between smoking and lncRNA alterations were analyzed. Kaplan-Meier analysis was performed to determine overall survival and disease-free survival. RESULTS: There were more non-smokers in LUSC than in LUAD. In both LUAD and LUSC, smoking could increase total mutation counts and the fraction of copy number alterations. Smoking index positively correlated with total mutations in LUAD, but not in LUSC. Smoking could trigger lncRNA alterations in both LUAD and LUSC. Smoking affected different lncRNAs in male and female patients. EXOC3-AS1 and LINC00603 alterations were positively correlated with smoking index in male LUAD smokers. In female LUAD smokers, smoking index was positively correlated with SNHG15, TP53TG1 and LINC01600 alterations and negatively with LINC00609 and PTCSC3 alterations. In both male and female LUSC patients, smoking increased or decreased several lncRNA alterations. DGCR5 alteration was higher in male than in female LUSC patients. In female LUSC patients, LOH12CR2 alteration was positively correlated with smoking index. CONCLUSIONS: Smoking promoted LUAD and LUSC development by affecting different lncRNA alterations in different genders.
Affiliation(s)
- Bing Liu
- Department of Respiratory Medicine, Zhongnan Hospital of Wuhan University, Wuhan 430071, China
| | - Yuan Liu
- Department of Respiratory Medicine, Zhongnan Hospital of Wuhan University, Wuhan 430071, China
| | - Jingfeng Zou
- Department of Respiratory Medicine, Zhongnan Hospital of Wuhan University, Wuhan 430071, China
| | - Menglin Zou
- Department of Respiratory Medicine, Zhongnan Hospital of Wuhan University, Wuhan 430071, China
| | - Zhenshun Cheng
- Department of Respiratory Medicine, Zhongnan Hospital of Wuhan University, Wuhan 430071, China
| |
|
13
|
The landscape and driver potential of site-specific hotspots across cancer genomes. NPJ Genom Med 2021; 6:33. [PMID: 33986299 PMCID: PMC8119706 DOI: 10.1038/s41525-021-00197-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/18/2020] [Accepted: 04/15/2021] [Indexed: 11/09/2022] Open
Abstract
Large sets of whole cancer genomes make it possible to study mutation hotspots genome-wide. Here we detect, categorize, and characterize site-specific hotspots using 2279 whole cancer genomes from the Pan-Cancer Analysis of Whole Genomes project and provide a resource of annotated hotspots genome-wide. We investigate the excess of hotspots in both protein-coding and gene regulatory regions and develop measures of positive selection and functional impact for individual hotspots. Using cancer allele fractions, expression aberrations, mutational signatures, and a variety of genomic features, such as potential gain or loss of transcription factor binding sites, we annotate and prioritize all highly mutated hotspots. Genome-wide we find more high-frequency SNV and indel hotspots than expected given mutational background models. Protein-coding regions are generally enriched for SNV hotspots compared to other regions. Gene regulatory hotspots show enrichment of potential same-patient second-hit missense mutations, consistent with enrichment of hotspot driver mutations compared to singletons. For protein-coding regions, splice-sites, promoters, and enhancers, we see an excess of hotspots associated with cancer genes. Interestingly, missense hotspot mutations in tumor suppressors are associated with elevated expression, suggesting localized amino-acid changes with functional impact. For individual non-coding hotspots, only a small number show clear signs of positive selection, including known sites in the TERT promoter and the 5' UTR of TP53. Most of the new candidates have few mutations and limited driver evidence. However, a hotspot in an enhancer of the oncogene POU2AF1, which may create a transcription factor binding site, presents multiple lines of driver-consistent evidence.
|
14
|
Martinez-Ledesma E, Flores D, Trevino V. Computational methods for detecting cancer hotspots. Comput Struct Biotechnol J 2020; 18:3567-3576. [PMID: 33304455 PMCID: PMC7711189 DOI: 10.1016/j.csbj.2020.11.020] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/13/2020] [Revised: 11/12/2020] [Accepted: 11/13/2020] [Indexed: 12/14/2022] Open
Abstract
Cancer mutations that are recurrently observed among patients are known as hotspots. Hotspots are highly relevant because they are presumably functional. Known hotspots in BRAF, PIK3CA, TP53, KRAS, and IDH1 support this idea. However, hundreds of hotspots have never been validated experimentally. Nevertheless, the detection of hotspots is challenging because background mutations obscure their statistical and computational identification. Although several algorithms have been applied to identify hotspots, they have not been reviewed before. Thus, in this mini-review, we summarize more than 40 computational methods applied to detect cancer hotspots in coding and non-coding DNA. We first organize the methods into cluster-based, 3D, position-specific, and miscellaneous categories to provide a general overview. Then, we describe their embedded procedures, implementations, variations, and differences. Finally, we discuss some advantages, provide some ideas for future developments, and mention opportunities such as application to viral integrations, translocations, and epigenetics.
Affiliation(s)
- Emmanuel Martinez-Ledesma
- Tecnologico de Monterrey, Escuela de Medicina y Ciencias de la Salud, Bioinformática y Diagnóstico Clínico, Monterrey, Nuevo León, Mexico
| | - David Flores
- Tecnologico de Monterrey, Escuela de Medicina y Ciencias de la Salud, Bioinformática y Diagnóstico Clínico, Monterrey, Nuevo León, Mexico
- Universidad del Caribe, Departamento de Ciencias Básicas e Ingenierías, Cancún, Quintana Roo, Mexico
| | - Victor Trevino
- Tecnologico de Monterrey, Escuela de Medicina y Ciencias de la Salud, Bioinformática y Diagnóstico Clínico, Monterrey, Nuevo León, Mexico
| |
|
15
|
Khalighi S, Singh S, Varadan V. Untangling a complex web: Computational analyses of tumor molecular profiles to decode driver mechanisms. J Genet Genomics 2020; 47:595-609. [PMID: 33423960 PMCID: PMC7902422 DOI: 10.1016/j.jgg.2020.11.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/12/2020] [Revised: 11/04/2020] [Accepted: 11/14/2020] [Indexed: 12/19/2022]
Abstract
Genome-scale studies focusing on molecular profiling of cancers across tissue types have revealed a plethora of aberrations across the genomic, transcriptomic, and epigenomic scales. The significant molecular heterogeneity across individual tumors even within the same tissue context complicates decoding the key etiologic mechanisms of this disease. Furthermore, it is increasingly likely that biologic mechanisms underlying the pathobiology of cancer involve multiple molecular entities interacting across functional scales. This has motivated the development of computational approaches that integrate molecular measurements with prior biological knowledge in increasingly intricate ways to enable the discovery of driver genomic aberrations across cancers. Here, we review diverse methodological approaches that have powered significant advances in our understanding of the genomic underpinnings of cancer at the cohort and at the individual tumor scales. We outline the key advances and challenges in the computational discovery of cancer mechanisms while motivating the development of systems biology approaches to comprehensively decode the biologic drivers of this complex disease.
Affiliation(s)
- Sirvan Khalighi
- Division of General Medical Sciences-Oncology, Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
| | - Salendra Singh
- Division of General Medical Sciences-Oncology, Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
| | - Vinay Varadan
- Division of General Medical Sciences-Oncology, Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA.
| |
|
16
|
Trevino V. Modeling and analysis of site-specific mutations in cancer identifies known plus putative novel hotspots and bias due to contextual sequences. Comput Struct Biotechnol J 2020; 18:1664-1675. [PMID: 32670506 PMCID: PMC7339035 DOI: 10.1016/j.csbj.2020.06.022] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/15/2019] [Revised: 06/10/2020] [Accepted: 06/12/2020] [Indexed: 11/22/2022] Open
Abstract
In cancer, recurrently mutated sites in DNA and proteins, called hotspots, are thought to be raised by positive selection and are therefore important due to their potential functional impact. Although recent evidence for APOBEC enzymatic activity has shown that specific types of sequences are likely to be false hotspots, the identification of putative hotspots is important to confirm either their functional role or their mechanistic bias. In this work, an algorithm and a statistical model are presented to detect hotspots. The model consists of a beta-binomial component plus fixed effects that efficiently fits the distribution of mutated sites. The algorithm employs an optimal stepwise approach to find the model parameters. Simulations show that the proposed algorithmic model is highly accurate for common hotspots. The approach has been applied to TCGA mutational data from 33 cancer types. The results show that well-known cancer hotspots are easily detected. In addition, novel hotspots are detected. An analysis of the sequence context of detected hotspots shows a preference for TCG sites that may be related to APOBEC or other unknown mechanistic biases. The detected hotspots are available online at http://bioinformatica.mty.itesm.mx/HotSpotsAnnotations.
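To make the beta-binomial idea in this abstract concrete, the following is a minimal Python sketch of how a site's recurrence could be scored against a beta-binomial background. It is not the author's implementation: the fixed-effects component and the stepwise parameter fitting are omitted, and the function names, the shape parameters `a` and `b`, the Bonferroni correction and the example counts are all assumptions made for illustration.

```python
import math

def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def betabinom_pmf(k, n, a, b):
    """P(X = k) for a beta-binomial with n trials and shape parameters a, b."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))

def betabinom_sf(k, n, a, b):
    """Upper tail P(X >= k): chance of seeing k or more mutated samples."""
    return sum(betabinom_pmf(i, n, a, b) for i in range(k, n + 1))

def call_hotspots(site_counts, n_samples, a, b, alpha=0.05):
    """Return sites whose Bonferroni-adjusted tail probability is <= alpha.

    site_counts maps a site label to the number of samples mutated there.
    a and b are assumed to be background parameters fitted elsewhere.
    """
    n_tests = len(site_counts)
    hits = {}
    for site, k in site_counts.items():
        p = betabinom_sf(k, n_samples, a, b)
        if p * n_tests <= alpha:
            hits[site] = p
    return hits

# Hypothetical counts in a cohort of 500 samples (illustrative values only).
counts = {"site_A": 40, "site_B": 1, "site_C": 2}
print(call_hotspots(counts, n_samples=500, a=0.2, b=200.0))
```

The beta-binomial is used here in place of a plain binomial because it allows the per-site mutation rate to vary, which gives a heavier tail and fewer spurious calls at moderately recurrent sites.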
Affiliation(s)
- Victor Trevino
- Tecnologico de Monterrey, Escuela de Medicina, Av Morones Prieto No. 3000, Colonia Los Doctores, Monterrey, Nuevo León Zip Code 64710, Mexico
| |
|
17
|
Guo YA, Chang MM, Skanderup AJ. MutSpot: detection of non-coding mutation hotspots in cancer genomes. NPJ Genom Med 2020; 5:26. [PMID: 32550006 PMCID: PMC7275039 DOI: 10.1038/s41525-020-0133-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/14/2019] [Accepted: 05/15/2020] [Indexed: 12/23/2022] Open
Abstract
Recurrence and clustering of somatic mutations (hotspots) in cancer genomes may indicate positive selection and involvement in tumorigenesis. MutSpot performs genome-wide inference of mutation hotspots in non-coding and regulatory DNA of cancer genomes. MutSpot performs feature selection across hundreds of epigenetic and sequence features followed by estimation of position- and patient-specific background somatic mutation probabilities. MutSpot is user-friendly, works on a standard workstation, and scales to thousands of cancer genomes.
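As an illustration of how patient-specific background probabilities can feed a recurrence test, the sketch below computes a Poisson-binomial tail probability: the chance that at least `k` patients carry a mutation in a region when each patient has a different background probability. The abstract does not say which test MutSpot applies, so this choice, the function name and the example probabilities are assumptions.

```python
def poisson_binomial_sf(probs, k):
    """P(at least k successes) for independent trials with unequal probabilities.

    probs holds one background mutation probability per patient (assumed to be
    estimated elsewhere, e.g. from epigenetic and sequence features).
    """
    # dist[j] is the probability of exactly j mutated patients so far.
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            nxt[j] += mass * (1.0 - p)   # this patient is not mutated
            nxt[j + 1] += mass * p       # this patient is mutated
        dist = nxt
    return sum(dist[k:])

# Hypothetical per-patient probabilities for one candidate region.
background = [0.001, 0.004, 0.002, 0.010, 0.003]
print(poisson_binomial_sf(background, 2))
```

The dynamic program runs in time quadratic in the number of patients, which is adequate for a single region; genome-wide use would typically need an approximation or a faster convolution.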
Affiliation(s)
- Yu Amanda Guo
- Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672 Singapore
| | - Mei Mei Chang
- Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672 Singapore
| | - Anders Jacobsen Skanderup
- Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, 60 Biopolis Street, Singapore, 138672 Singapore
| |
|
18
|
Shuai S, Gallinger S, Stein LD. Combined burden and functional impact tests for cancer driver discovery using DriverPower. Nat Commun 2020; 11:734. [PMID: 32024818 PMCID: PMC7002750 DOI: 10.1038/s41467-019-13929-1] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 11/21/2017] [Accepted: 12/09/2019] [Indexed: 12/14/2022] Open
Abstract
The discovery of driver mutations is one of the key motivations for cancer genome sequencing. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumour types, we describe DriverPower, a software package that uses mutational burden and functional impact evidence to identify driver mutations in coding and non-coding sites within cancer whole genomes. Using a total of 1373 genomic features derived from public sources, DriverPower's background mutation model explains up to 93% of the regional variance in the mutation rate across multiple tumour types. By incorporating functional impact scores, we are able to further increase the accuracy of driver discovery. Testing across a collection of 2583 cancer genomes from the PCAWG project, DriverPower identifies 217 coding and 95 non-coding driver candidates. Compared with six published methods used by the PCAWG Drivers and Functional Interpretation Working Group, DriverPower has the highest F1 score for both coding and non-coding driver discovery. This demonstrates that DriverPower is an effective framework for computational driver discovery.
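The F1 score mentioned in this abstract is the harmonic mean of precision and recall. A small Python sketch of how it could be computed for a set of predicted drivers against a reference set follows; the function name and both example gene sets are hypothetical and are not taken from the DriverPower benchmark.

```python
def f1_score(predicted, reference):
    """F1 = 2 * precision * recall / (precision + recall) for two sets of calls."""
    predicted, reference = set(predicted), set(reference)
    true_pos = len(predicted & reference)
    if true_pos == 0:
        return 0.0  # no overlap: precision and recall are both zero
    precision = true_pos / len(predicted)
    recall = true_pos / len(reference)
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: 2 of 3 predictions are in a 4-gene reference set.
predicted = {"TP53", "KRAS", "GENE_X"}
reference = {"TP53", "KRAS", "BRAF", "PIK3CA"}
print(f1_score(predicted, reference))  # precision 2/3, recall 1/2, F1 = 4/7
```

Because F1 ignores true negatives, it is a reasonable summary when the vast majority of genomic elements are not drivers, as is the case in driver discovery.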
Affiliation(s)
- Shimin Shuai
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada, M5S 1A8.
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, ON, Canada, M5G 0A3.
| | - Steven Gallinger
- Division of General Surgery, Toronto General Hospital, Toronto, ON, Canada, M5G 2C4
- Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada, M5G 1X5
| | - Lincoln D Stein
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada, M5S 1A8.
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, ON, Canada, M5G 0A3.
| |
|
19
|
Carlevaro-Fita J, Lanzós A, Feuerbach L, Hong C, Mas-Ponte D, Pedersen JS, Johnson R. Cancer LncRNA Census reveals evidence for deep functional conservation of long noncoding RNAs in tumorigenesis. Commun Biol 2020; 3:56. [PMID: 32024996 PMCID: PMC7002399 DOI: 10.1038/s42003-019-0741-7] [Citation(s) in RCA: 139] [Impact Index Per Article: 27.8] [Received: 03/23/2018] [Accepted: 08/31/2018] [Indexed: 12/14/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) are a growing focus of cancer genomics studies, creating the need for a resource of lncRNAs with validated cancer roles. Furthermore, it remains debated whether mutated lncRNAs can drive tumorigenesis, and whether such functions could be conserved during evolution. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we introduce the Cancer LncRNA Census (CLC), a compilation of 122 GENCODE lncRNAs with causal roles in cancer phenotypes. In contrast to existing databases, CLC requires strong functional or genetic evidence. CLC genes are enriched amongst driver genes predicted from somatic mutations, and display characteristic genomic features. Strikingly, CLC genes are enriched for driver mutations from unbiased, genome-wide transposon-mutagenesis screens in mice. We identified 10 tumour-causing mutations in orthologues of 8 lncRNAs, including LINC-PINT and NEAT1, but not MALAT1. Thus CLC represents a dataset of high-confidence cancer lncRNAs. Mutagenesis maps are a novel means for identifying deeply-conserved roles of lncRNAs in tumorigenesis.
Affiliation(s)
- Joana Carlevaro-Fita
- Department of Medical Oncology, Inselspital, University Hospital and University of Bern, 3010, Bern, Switzerland
- Department of Biomedical Research, University of Bern, 3008, Bern, Switzerland
- Graduate School for Cellular and Biomedical Sciences, University of Bern, 3012, Bern, Switzerland
| | - Andrés Lanzós
- Department of Medical Oncology, Inselspital, University Hospital and University of Bern, 3010, Bern, Switzerland
- Department of Biomedical Research, University of Bern, 3008, Bern, Switzerland
- Graduate School for Cellular and Biomedical Sciences, University of Bern, 3012, Bern, Switzerland
| | - Lars Feuerbach
- Applied Bioinformatics, Deutsches Krebsforschungszentrum, 69120, Heidelberg, Germany
| | - Chen Hong
- Applied Bioinformatics, Deutsches Krebsforschungszentrum, 69120, Heidelberg, Germany
| | - David Mas-Ponte
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Institut Hospital del Mar d'Investigacions Mèdiques (IMIM), Dr. Aiguader 88, 08003, Barcelona, Spain
| | - Jakob Skou Pedersen
- Department for Molecular Medicine, Aarhus University Hospital, Palle Juul-Jensens Boulevard 99, 8200, Aarhus N, Denmark
| | - Rory Johnson
- Department of Medical Oncology, Inselspital, University Hospital and University of Bern, 3010, Bern, Switzerland.
- Department of Biomedical Research, University of Bern, 3008, Bern, Switzerland.
- Graduate School for Cellular and Biomedical Sciences, University of Bern, 3012, Bern, Switzerland.
| |
|
20
|
Rheinbay E, Nielsen MM, Abascal F, Wala JA, Shapira O, Tiao G, Hornshøj H, Hess JM, Juul RI, Lin Z, Feuerbach L, Sabarinathan R, Madsen T, Kim J, Mularoni L, Shuai S, Lanzós A, Herrmann C, Maruvka YE, Shen C, Amin SB, Bandopadhayay P, Bertl J, Boroevich KA, Busanovich J, Carlevaro-Fita J, Chakravarty D, Chan CWY, Craft D, Dhingra P, Diamanti K, Fonseca NA, Gonzalez-Perez A, Guo Q, Hamilton MP, Haradhvala NJ, Hong C, Isaev K, Johnson TA, Juul M, Kahles A, Kahraman A, Kim Y, Komorowski J, Kumar K, Kumar S, Lee D, Lehmann KV, Li Y, Liu EM, Lochovsky L, Park K, Pich O, Roberts ND, Saksena G, Schumacher SE, Sidiropoulos N, Sieverling L, Sinnott-Armstrong N, Stewart C, Tamborero D, Tubio JMC, Umer HM, Uusküla-Reimand L, Wadelius C, Wadi L, Yao X, Zhang CZ, Zhang J, Haber JE, Hobolth A, Imielinski M, Kellis M, Lawrence MS, von Mering C, Nakagawa H, Raphael BJ, Rubin MA, Sander C, Stein LD, Stuart JM, Tsunoda T, Wheeler DA, Johnson R, Reimand J, Gerstein M, Khurana E, Campbell PJ, López-Bigas N, Weischenfeldt J, Beroukhim R, Martincorena I, Pedersen JS, Getz G. Analyses of non-coding somatic drivers in 2,658 cancer whole genomes. Nature 2020; 578:102-111. [PMID: 32025015 PMCID: PMC7054214 DOI: 10.1038/s41586-020-1965-x] [Citation(s) in RCA: 402] [Impact Index Per Article: 80.4] [Received: 01/19/2018] [Accepted: 12/02/2019] [Indexed: 01/28/2023]
Abstract
The discovery of drivers of cancer has traditionally focused on protein-coding genes [1-4]. Here we present analyses of driver point mutations and structural variants in non-coding regions across 2,658 genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium [5] of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). For point mutations, we developed a statistically rigorous strategy for combining significance levels from multiple methods of driver discovery that overcomes the limitations of individual methods. For structural variants, we present two methods of driver discovery, and identify regions that are significantly affected by recurrent breakpoints and recurrent somatic juxtapositions. Our analyses confirm previously reported drivers [6,7], raise doubts about others and identify novel candidates, including point mutations in the 5' region of TP53, in the 3' untranslated regions of NFKBIZ and TOB1, focal deletions in BRD4 and rearrangements in the loci of AKR1C genes. We show that although point mutations and structural variants that drive cancer are less frequent in non-coding genes and regulatory sequences than in protein-coding genes, additional examples of these drivers will be found as more cancer genomes become available.
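To show what combining significance levels from several methods can look like in practice, here is a minimal Python sketch of Fisher's method, the textbook approach for merging p-values. The abstract does not specify the strategy the authors used, so Fisher's method is swapped in purely as an illustration; it assumes the p-values are independent, which is unlikely to hold when several driver-discovery tools analyse the same mutations. The function name and example values are hypothetical.

```python
import math

def fisher_combine(p_values):
    """Combine p-values with Fisher's method (assumes independent tests).

    The statistic -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom under the null, where k = len(p_values).
    Every p-value must be strictly greater than 0.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # statistic / 2
    # Closed-form chi-square upper tail for an even number (2k) of
    # degrees of freedom: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total

# Hypothetical p-values reported by three methods for one genomic element.
print(fisher_combine([0.04, 0.01, 0.20]))
```

When the inputs are correlated, this calculation tends to overstate significance, which is one reason a dependence-aware combination strategy is preferable for ensembles of driver-discovery methods.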
Affiliation(s)
- Esther Rheinbay
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Morten Muhlig Nielsen
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
| | | | - Jeremiah A Wala
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Bioinformatics and Integrative Genomics, Harvard University, Cambridge, MA, USA
| | - Ofer Shapira
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Grace Tiao
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Henrik Hornshøj
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
| | - Julian M Hess
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Randi Istrup Juul
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
| | - Ziao Lin
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard University, Cambridge, MA, USA
| | - Lars Feuerbach
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Radhakrishnan Sabarinathan
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Spain
| | - Tobias Madsen
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
| | - Jaegil Kim
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Loris Mularoni
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Spain
| | - Shimin Shuai
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Andrés Lanzós
- Department for BioMedical Research, University of Bern, Bern, Switzerland
- Graduate School of Cellular and Biomedical Sciences, University of Bern, Bern, Switzerland
- Department of Medical Oncology, Bern University Hospital, University of Bern, Bern, Switzerland
| | - Carl Herrmann
- Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Bioquant Center, Institute of Pharmacy and Molecular Biotechnology, University of Heidelberg, Heidelberg, Germany
| | - Yosef E Maruvka
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
| | - Ciyue Shen
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- cBio Center, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
| | - Samirkumar B Amin
- Department of Genomic Medicine, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Graduate Program in Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, TX, USA
| | - Pratiti Bandopadhayay
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Johanna Bertl
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
| | - Keith A Boroevich
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - John Busanovich
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Joana Carlevaro-Fita
- Department for BioMedical Research, University of Bern, Bern, Switzerland
- Graduate School of Cellular and Biomedical Sciences, University of Bern, Bern, Switzerland
- Department of Medical Oncology, Bern University Hospital, University of Bern, Bern, Switzerland
| | - Dimple Chakravarty
- Department of Genitourinary Medical Oncology - Research, Division of Cancer Medicine, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Department of Urology, Icahn school of Medicine at Mount Sinai, New York, NY, USA
| | - Calvin Wing Yiu Chan
- Division of Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
| | - David Craft
- Department of Radiation Oncology, Harvard Medical School, Massachusetts General Hospital, Boston, MA, USA
| | - Priyanka Dhingra
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
| | - Klev Diamanti
- Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| | - Nuno A Fonseca
- European Bioinformatics Institute, European Molecular Biology Laboratory, Hinxton, UK
| | - Abel Gonzalez-Perez
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Spain
| | - Qianyun Guo
- Bioinformatics Research Centre (BiRC), Aarhus University, Aarhus, Denmark
| | - Mark P Hamilton
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
| | - Nicholas J Haradhvala
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
| | - Chen Hong
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
| | - Keren Isaev
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
| | - Todd A Johnson
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Malene Juul
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark
| | - Andre Kahles
- Division of Computational Biology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Abdullah Kahraman
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
| | - Youngwook Kim
- Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
| | - Jan Komorowski
- Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
| | - Kiran Kumar
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Sushant Kumar
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
| | - Donghoon Lee
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
| | - Kjong-Van Lehmann
- Division of Computational Biology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Yilong Li
- SBGD Inc, Cambridge, MA, USA
- Department of Haematology, University of Cambridge, Cambridge, UK
| | - Eric Minwei Liu
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
| | - Lucas Lochovsky
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
| | - Keunchil Park
- Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
| | - Oriol Pich
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Spain
| | - Nicola D Roberts
- Department of Haematology, University of Cambridge, Cambridge, UK
| | - Gordon Saksena
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Steven E Schumacher
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Nikos Sidiropoulos
- Biotech Research & Innovation Centre (BRIC), The Finsen Laboratory, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
| | - Lina Sieverling
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
| | - Chip Stewart
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - David Tamborero
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Spain
| | - Jose M C Tubio
- Department of Zoology, Genetics and Physical Anthropology, Universidade de Santiago de Compostela, Santiago de Compostela, Spain
- Centre for Research in Molecular Medicine and Chronic Diseases (CIMUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
- The Biomedical Research Centre (CINBIO), Universidade de Vigo, Vigo, Spain
| | - Husen M Umer
- Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Department of Oncology-Pathology, Science for Life Laboratory, Karolinska Institute, Stockholm, Sweden
| | - Liis Uusküla-Reimand
- Genetics and Genome Biology Program, SickKids Research Institute, Toronto, Ontario, Canada
- Department of Gene Technology, Tallinn University of Technology, Tallinn, Estonia
| | - Claes Wadelius
- Science for Life Laboratory, Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
| | - Lina Wadi
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Cheng-Zhong Zhang
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Jing Zhang
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
| | - James E Haber
- Department of Biology and Rosenstiel Basic Medical Sciences Research Center, Brandeis University, Waltham, MA, USA
| | - Asger Hobolth
- Bioinformatics Research Centre (BiRC), Aarhus University, Aarhus, Denmark
| | - Marcin Imielinski
- New York Genome Center, New York, NY, USA
- Department of Pathology and Laboratory Medicine, and Englander Institute for Precision Medicine, and Institute for Computational Biomedicine, and Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
| | - Manolis Kellis
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
| | - Michael S Lawrence
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA
| | - Christian von Mering
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
| | - Hidewaki Nakagawa
- Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Tokyo, Japan
| | - Benjamin J Raphael
- Department of Computer Science, Princeton University, Princeton, NJ, USA
| | - Mark A Rubin
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
- Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
| | - Chris Sander
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- cBio Center, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
| | - Lincoln D Stein
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Joshua M Stuart
- Center for Biomolecular Science and Engineering, University of California at Santa Cruz, Santa Cruz, CA, USA
| | - Tatsuhiko Tsunoda
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
- Laboratory for Medical Science Mathematics, Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
| | - David A Wheeler
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Rory Johnson
- Department for BioMedical Research, University of Bern, Bern, Switzerland
- Department of Medical Oncology, Bern University Hospital, University of Bern, Bern, Switzerland
| | - Jüri Reimand
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
| | - Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
- Department of Computer Science, Yale University, New Haven, CT, USA
| | - Ekta Khurana
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
| | - Peter J Campbell
- Wellcome Trust Sanger Institute, Hinxton, UK
- Department of Haematology, University of Cambridge, Cambridge, UK
| | - Núria López-Bigas
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Spain
- Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
| | - Joachim Weischenfeldt
- Biotech Research & Innovation Centre (BRIC), The Finsen Laboratory, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark.
- Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany.
| | - Rameen Beroukhim
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Bioinformatics and Integrative Genomics, Harvard University, Cambridge, MA, USA.
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA.
| | - Jakob Skou Pedersen
- Department of Molecular Medicine (MOMA), Aarhus University Hospital, Aarhus, Denmark.
- Bioinformatics Research Centre (BiRC), Aarhus University, Aarhus, Denmark.
| | - Gad Getz
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA, USA.
- Harvard Medical School, Boston, MA, USA.
- Department of Pathology, Massachusetts General Hospital, Boston, MA, USA.
|
21
|
Zhu H, Uusküla-Reimand L, Isaev K, Wadi L, Alizada A, Shuai S, Huang V, Aduluso-Nwaobasi D, Paczkowska M, Abd-Rabbo D, Ocsenas O, Liang M, Thompson JD, Li Y, Ruan L, Krassowski M, Dzneladze I, Simpson JT, Lupien M, Stein LD, Boutros PC, Wilson MD, Reimand J. Candidate Cancer Driver Mutations in Distal Regulatory Elements and Long-Range Chromatin Interaction Networks. Mol Cell 2020; 77:1307-1321.e10. [PMID: 31954095 DOI: 10.1016/j.molcel.2019.12.027] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 01/02/2019] [Revised: 06/04/2019] [Accepted: 12/24/2019] [Indexed: 12/17/2022]
Abstract
A comprehensive catalog of cancer driver mutations is essential for understanding tumorigenesis and developing therapies. Exome-sequencing studies have mapped many protein-coding drivers, yet few non-coding drivers are known because genome-wide discovery is challenging. We developed a driver discovery method, ActiveDriverWGS, and analyzed 120,788 cis-regulatory modules (CRMs) across 1,844 whole tumor genomes from the ICGC-TCGA PCAWG project. We found 30 CRMs with enriched SNVs and indels (FDR < 0.05). These frequently mutated regulatory elements (FMREs) were ubiquitously active in human tissues, showed long-range chromatin interactions and mRNA abundance associations with target genes, and were enriched in motif-rewiring mutations and structural variants. Genomic deletion of one FMRE in human cells caused proliferative deficiencies and transcriptional deregulation of cancer genes CCNB1IP1, CDH1, and CDKN2B, validating observations in FMRE-mutated tumors. Pathway analysis revealed further sub-significant FMREs at cancer genes and processes, indicating an unexplored landscape of infrequent driver mutations in the non-coding genome.
Affiliation(s)
- Helen Zhu
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada
| | - Liis Uusküla-Reimand
- Program in Genetics and Genome Biology, SickKids Research Institute, Peter Gilgan Centre for Research and Learning (PGCRL), 686 Bay Street, Toronto, ON M5G 0A4, Canada; Division of Gene Technology, Department of Chemistry and Biotechnology, Tallinn University of Technology, Akadeemia tee 15, Tallinn 12618, Estonia
| | - Keren Isaev
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada
| | - Lina Wadi
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Azad Alizada
- Program in Genetics and Genome Biology, SickKids Research Institute, Peter Gilgan Centre for Research and Learning (PGCRL), 686 Bay Street, Toronto, ON M5G 0A4, Canada
| | - Shimin Shuai
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Molecular Genetics, University of Toronto, 1 King's College Circle, Toronto, ON M5S 1A8, Canada
| | - Vincent Huang
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Dike Aduluso-Nwaobasi
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Marta Paczkowska
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Diala Abd-Rabbo
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Oliver Ocsenas
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada
| | - Minggao Liang
- Program in Genetics and Genome Biology, SickKids Research Institute, Peter Gilgan Centre for Research and Learning (PGCRL), 686 Bay Street, Toronto, ON M5G 0A4, Canada; Department of Molecular Genetics, University of Toronto, 1 King's College Circle, Toronto, ON M5S 1A8, Canada
| | - J Drew Thompson
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Yao Li
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Luyao Ruan
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Michal Krassowski
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Irakli Dzneladze
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada
| | - Jared T Simpson
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Computer Science, University of Toronto, 214 College Street, Toronto, ON M5T 3A1, Canada
| | - Mathieu Lupien
- Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada; Princess Margaret Cancer Centre, 101 College Street, Toronto, ON M5G 0A3, Canada
| | - Lincoln D Stein
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Molecular Genetics, University of Toronto, 1 King's College Circle, Toronto, ON M5S 1A8, Canada
| | - Paul C Boutros
- Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada; Department of Human Genetics, University of California Los Angeles, 10833 Le Conte Avenue, Los Angeles, CA 90095, USA; Department of Urology, University of California Los Angeles, 200 Medical Plaza Driveway #140, Los Angeles, CA 90024, USA; Institute of Precision Health, University of California Los Angeles, 10833 Le Conte Avenue, Los Angeles, CA 90024, USA; Jonsson Comprehensive Cancer Centre, University of California Los Angeles, 10833 Le Conte Avenue, Los Angeles, CA 90024, USA
| | - Michael D Wilson
- Program in Genetics and Genome Biology, SickKids Research Institute, Peter Gilgan Centre for Research and Learning (PGCRL), 686 Bay Street, Toronto, ON M5G 0A4, Canada; Department of Molecular Genetics, University of Toronto, 1 King's College Circle, Toronto, ON M5S 1A8, Canada
| | - Jüri Reimand
- Computational Biology Program, Ontario Institute for Cancer Research, 661 University Avenue Suite 510, Toronto, ON M5G 0A3, Canada; Department of Medical Biophysics, University of Toronto, 101 College Street Suite 15-701, Toronto, ON M5G 1L7, Canada.
|
22
|
Nath A, Lau EYT, Lee AM, Geeleher P, Cho WCS, Huang RS. Discovering long noncoding RNA predictors of anticancer drug sensitivity beyond protein-coding genes. Proc Natl Acad Sci U S A 2019; 116:22020-22029. [PMID: 31548386 PMCID: PMC6825320 DOI: 10.1073/pnas.1909998116] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Indexed: 01/23/2023] Open
Abstract
Large-scale cancer cell line screens have identified thousands of protein-coding genes (PCGs) as biomarkers of anticancer drug response. However, systematic evaluation of long noncoding RNAs (lncRNAs) as pharmacogenomic biomarkers has so far proven challenging. Here, we study the contribution of lncRNAs as drug response predictors beyond spurious associations driven by correlations with proximal PCGs, tissue lineage, or established biomarkers. We show that, as a whole, the lncRNA transcriptome is equally potent as the PCG transcriptome at predicting response to hundreds of anticancer drugs. Analysis of individual lncRNA transcripts associated with drug response reveals nearly half of the significant associations are in fact attributable to proximal cis-PCGs. However, adjusting for effects of cis-PCGs revealed significant lncRNAs that augment drug response predictions for most drugs, including those with well-established clinical biomarkers. In addition, we identify lncRNA-specific somatic alterations associated with drug response by adopting a statistical approach to determine lncRNAs carrying somatic mutations that undergo positive selection in cancer cells. Lastly, we experimentally demonstrate that 2 lncRNAs, EGFR-AS1 and MIR205HG, are functionally relevant predictors of anti-epidermal growth factor receptor (EGFR) drug response.
Affiliation(s)
- Aritro Nath
- Department of Experimental and Clinical Pharmacology, University of Minnesota, Minneapolis, MN 55455
| | - Eunice Y T Lau
- Department of Clinical Oncology, Queen Elizabeth Hospital, Hong Kong SAR, China
| | - Adam M Lee
- Department of Experimental and Clinical Pharmacology, University of Minnesota, Minneapolis, MN 55455
| | - Paul Geeleher
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105
| | - William C S Cho
- Department of Clinical Oncology, Queen Elizabeth Hospital, Hong Kong SAR, China
| | - R Stephanie Huang
- Department of Experimental and Clinical Pharmacology, University of Minnesota, Minneapolis, MN 55455
|
23
|
Juul M, Madsen T, Guo Q, Bertl J, Hobolth A, Kellis M, Pedersen JS. ncdDetect2: improved models of the site-specific mutation rate in cancer and driver detection with robust significance evaluation. Bioinformatics 2019; 35:189-199. [PMID: 29945188 PMCID: PMC6330011 DOI: 10.1093/bioinformatics/bty511] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/09/2018] [Accepted: 06/24/2018] [Indexed: 01/22/2023] Open
Abstract
Motivation: Understanding the mutational processes that act during cancer development is a key topic of cancer biology. Nevertheless, much remains to be learned, as a complex interplay of processes with dependencies on a range of genomic features creates highly heterogeneous cancer genomes. Accurate driver detection relies on unbiased models of the mutation rate that also capture rate variation from uncharacterized sources. Results: Here, we analyse patterns of observed-to-expected mutation counts across 505 whole cancer genomes, and find that genomic features missing from our mutation-rate model likely operate on a megabase length scale. We extend our site-specific model of the mutation rate to include the additional variance from these sources, which leads to robust significance evaluation of candidate cancer drivers. We thus present ncdDetect v.2, with greatly improved cancer driver detection specificity. Finally, we show that ranking candidates by their posterior mean value of their effect sizes offers an equivalent and more computationally efficient alternative to ranking by their P-values. Availability and implementation: ncdDetect v.2 is implemented as an R-package and is freely available at http://github.com/TobiasMadsen/ncdDetect2. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Malene Juul
- Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, DK-8200 Aarhus N, Denmark; Bioinformatics Research Centre, Aarhus University, C.F. Mollers Alle 8, DK-8000 Aarhus C, Denmark
| | - Tobias Madsen
- Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, DK-8200 Aarhus N, Denmark; Bioinformatics Research Centre, Aarhus University, C.F. Mollers Alle 8, DK-8000 Aarhus C, Denmark
| | - Qianyun Guo
- Bioinformatics Research Centre, Aarhus University, C.F. Mollers Alle 8, DK-8000 Aarhus C, Denmark
| | - Johanna Bertl
- Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, DK-8200 Aarhus N, Denmark
| | - Asger Hobolth
- Bioinformatics Research Centre, Aarhus University, C.F. Mollers Alle 8, DK-8000 Aarhus C, Denmark
| | - Manolis Kellis
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
| | - Jakob Skou Pedersen
- Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, DK-8200 Aarhus N, Denmark; Bioinformatics Research Centre, Aarhus University, C.F. Mollers Alle 8, DK-8000 Aarhus C, Denmark
|
24
|
Sharma A, Jiang C, De S. Dissecting the sources of gene expression variation in a pan-cancer analysis identifies novel regulatory mutations. Nucleic Acids Res 2018; 46:4370-4381. [PMID: 29672706 PMCID: PMC5961375 DOI: 10.1093/nar/gky271] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 10/01/2017] [Accepted: 03/29/2018] [Indexed: 02/07/2023] Open
Abstract
Although the catalog of cancer-associated mutations in protein-coding regions is nearly complete for all major cancer types, an assessment of regulatory changes in cancer genomes and their clinical significance remains largely preliminary. Adopting a bottom-up approach, we quantify the effects of different sources of gene expression variation in a cohort of 3899 samples from 10 cancer types. We find that copy number alterations, epigenetic changes, transcription factors and microRNAs collectively explain, on average, only 31–38% and 18–26% expression variation for cancer-associated and other genes, respectively, and that among these factors copy number alteration has the highest effect. We show that the genes with systematic, large expression variation that could not be attributed to these factors are enriched for pathways related to cancer hallmarks. Integrating whole genome sequencing data and focusing on genes with systematic expression variation, we identify novel, recurrent regulatory mutations affecting known cancer genes such as NKX2-1 and GRIN2D in multiple cancer types. Nonetheless, at a genome-wide scale, proportions of gene expression variation attributed to recurrent point mutations appear to be modest so far, especially when compared to that attributed to copy number changes – a pattern different from that observed for other complex diseases and traits. We suspect that, owing to plasticity and redundancy in biological pathways, regulatory alterations show complex combinatorial patterns, modulating gene expression in cancer genomes at a finer scale.
Affiliation(s)
- Anchal Sharma
- Center for Systems and Computational Biology, Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Chuan Jiang
- Center for Systems and Computational Biology, Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Subhajyoti De
- Center for Systems and Computational Biology, Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
|
25
|
Liu EM, Martinez-Fundichely A, Diaz BJ, Aronson B, Cuykendall T, MacKay M, Dhingra P, Wong EWP, Chi P, Apostolou E, Sanjana NE, Khurana E. Identification of Cancer Drivers at CTCF Insulators in 1,962 Whole Genomes. Cell Syst 2019; 8:446-455.e8. [PMID: 31078526 PMCID: PMC6917527 DOI: 10.1016/j.cels.2019.04.001] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Received: 11/22/2017] [Revised: 11/20/2018] [Accepted: 04/02/2019] [Indexed: 12/15/2022]
Abstract
Recent studies have shown that mutations at non-coding elements, such as promoters and enhancers, can act as cancer drivers. However, an important class of non-coding elements, namely CTCF insulators, has been overlooked in the previous driver analyses. We used insulator annotations from CTCF and cohesin ChIA-PET and analyzed somatic mutations in 1,962 whole genomes from 21 cancer types. Using the heterogeneous patterns of transcription-factor-motif disruption, functional impact, and recurrence of mutations, we developed a computational method that revealed 21 insulators showing signals of positive selection. In particular, mutations in an insulator in multiple cancer types, including 16% of melanoma samples, are associated with TGFB1 up-regulation. Using CRISPR-Cas9, we find that alterations at two of the most frequently mutated regions in this insulator increase cell growth by 40%-50%, supporting the role of this boundary element as a cancer driver. Thus, our study reveals several CTCF insulators as putative cancer drivers.
Affiliation(s)
- Eric Minwei Liu
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10065, USA; Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Alexander Martinez-Fundichely
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10065, USA; Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Bianca Jay Diaz
- New York Genome Center, New York, NY 10013, USA; Department of Biology, New York University, New York, NY 10003, USA
| | - Boaz Aronson
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10065, USA; Department of Medicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Tawny Cuykendall
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10065, USA; Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Matthew MacKay
- Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Priyanka Dhingra
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10065, USA; Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Elissa W P Wong
- Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Ping Chi
- Department of Medicine, Weill Cornell Medicine, New York, NY 10021, USA; Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Effie Apostolou
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10065, USA; Department of Medicine, Weill Cornell Medicine, New York, NY 10021, USA
| | - Neville E Sanjana
- New York Genome Center, New York, NY 10013, USA; Department of Biology, New York University, New York, NY 10003, USA
| | - Ekta Khurana
- Meyer Cancer Center, Weill Cornell Medicine, New York, NY 10065, USA; Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA; Caryl and Israel Englander Institute for Precision Medicine, New York Presbyterian Hospital, Weill Cornell Medicine, New York, NY 10065, USA.
|
26
|
Deng Y, Luo S, Zhang X, Zou C, Yuan H, Liao G, Xu L, Deng C, Lan Y, Zhao T, Gao X, Xiao Y, Li X. A pan-cancer atlas of cancer hallmark-associated candidate driver lncRNAs. Mol Oncol 2018; 12:1980-2005. [PMID: 30216655 PMCID: PMC6210054 DOI: 10.1002/1878-0261.12381] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 04/21/2018] [Revised: 07/21/2018] [Accepted: 09/03/2018] [Indexed: 12/12/2022] Open
Abstract
Substantial cancer genome sequencing efforts have discovered many important driver genes contributing to tumorigenesis. However, very little is known about the genetic alterations of long non‐coding RNAs (lncRNAs) in cancer. Thus, there is a need for systematic surveys of driver lncRNAs. Through integrative analysis of 5918 tumors across 11 cancer types, we revealed that lncRNAs have undergone dramatic genomic alterations, many of which are mutually exclusive with well‐known cancer genes. Using the hypothesis of functional redundancy of mutual exclusivity, we developed a computational framework to identify driver lncRNAs associated with different cancer hallmarks. Applying it to pan‐cancer data, we identified 378 candidate driver lncRNAs whose genomic features highly resemble the known cancer driver genes (e.g. high conservation and early replication). We further validated the candidate driver lncRNAs involved in ‘Tissue Invasion and Metastasis’ in lung adenocarcinoma and breast cancer, and also highlighted their potential roles in improving clinical outcomes. In summary, we have generated a comprehensive landscape of cancer candidate driver lncRNAs that could act as a starting point for future functional explorations, as well as the identification of biomarkers and lncRNA‐based target therapy.
Affiliation(s)
- Yulan Deng
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Shangyi Luo
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Xinxin Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Chaoxia Zou
- Department of Biochemistry and Molecular Biology, Harbin Medical University, China
| | - Huating Yuan
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Gaoming Liao
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Liwen Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Chunyu Deng
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Yujia Lan
- College of Bioinformatics Science and Technology, Harbin Medical University, China
| | - Tingting Zhao
- Department of Neurology, The First Affiliated Hospital of Harbin Medical University, China
| | - Xu Gao
- Department of Biochemistry and Molecular Biology, Harbin Medical University, China
| | - Yun Xiao
- College of Bioinformatics Science and Technology, Harbin Medical University, China; Key Laboratory of Cardiovascular Medicine Research, Harbin Medical University, Ministry of Education, China
| | - Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, China; Key Laboratory of Cardiovascular Medicine Research, Harbin Medical University, Ministry of Education, China
|
27
|
Uszczynska-Ratajczak B, Lagarde J, Frankish A, Guigó R, Johnson R. Towards a complete map of the human long non-coding RNA transcriptome. Nat Rev Genet 2018; 19:535-548. [PMID: 29795125 PMCID: PMC6451964 DOI: 10.1038/s41576-018-0017-y] [Citation(s) in RCA: 420] [Impact Index Per Article: 60.0] [Indexed: 02/01/2023]
Abstract
Gene maps, or annotations, enable us to navigate the functional landscape of our genome. They are a resource upon which virtually all studies depend, from single-gene to genome-wide scales and from basic molecular biology to medical genetics. Yet present-day annotations suffer from trade-offs between quality and size, with serious but often unappreciated consequences for downstream studies. This is particularly true for long non-coding RNAs (lncRNAs), which are poorly characterized compared to protein-coding genes. Long-read sequencing technologies promise to improve current annotations, paving the way towards a complete annotation of lncRNAs expressed throughout a human lifetime.
Affiliation(s)
| | - Julien Lagarde
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Catalonia, Spain
| | - Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Roderic Guigó
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Catalonia, Spain
| | - Rory Johnson
- Department of Medical Oncology, Inselspital, University Hospital and University of Bern, Bern, Switzerland.
- Department of Biomedical Research (DBMR), University of Bern, Bern, Switzerland.
|
28
|
Bertl J, Guo Q, Juul M, Besenbacher S, Nielsen MM, Hornshøj H, Pedersen JS, Hobolth A. A site specific model and analysis of the neutral somatic mutation rate in whole-genome cancer data. BMC Bioinformatics 2018; 19:147. [PMID: 29673314 PMCID: PMC5909259 DOI: 10.1186/s12859-018-2141-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/06/2017] [Accepted: 03/27/2018] [Indexed: 01/02/2023] Open
Abstract
Background: Detailed modelling of the neutral mutational process in cancer cells is crucial for identifying driver mutations and understanding the mutational mechanisms that act during cancer development. The neutral mutational process is very complex: whole-genome analyses have revealed that the mutation rate differs between cancer types, between patients and along the genome depending on the genetic and epigenetic context. Therefore, methods that predict the number of different types of mutations in regions or specific genomic elements must consider local genomic explanatory variables. A major drawback of most methods is the need to average the explanatory variables across the entire region or genomic element. This procedure is particularly problematic if the explanatory variable varies dramatically in the element under consideration.
Results: To take into account the fine scale of the explanatory variables, we model the probabilities of different types of mutations for each position in the genome by multinomial logistic regression. We analyse 505 cancer genomes from 14 different cancer types and compare the performance in predicting mutation rate for both regional based models and site-specific models. We show that for 1000 randomly selected genomic positions, the site-specific model predicts the mutation rate much better than regional based models. We use a forward selection procedure to identify the most important explanatory variables. The procedure identifies site-specific conservation (phyloP), replication timing, and expression level as the best predictors for the mutation rate. Finally, our model confirms and quantifies certain well-known mutational signatures.
Conclusion: We find that our site-specific multinomial regression model outperforms the regional based models. The possibility of including genomic variables on different scales and patient specific variables makes it a versatile framework for studying different mutational mechanisms. Our model can serve as the neutral null model for the mutational process; regions that deviate from the null model are candidates for elements that drive cancer development.
Electronic supplementary material: The online version of this article (10.1186/s12859-018-2141-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Johanna Bertl
- Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, Aarhus N, DK-8200, Denmark.
| | - Qianyun Guo
- Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, Aarhus N, DK-8200, Denmark
| | - Malene Juul
- Bioinformatics Research Centre, Aarhus University, C.F. Mollers Alle 8, Aarhus C, DK-8000, Denmark
| | - Søren Besenbacher
- Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, Aarhus N, DK-8200, Denmark
| | - Morten Muhlig Nielsen
- Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, Aarhus N, DK-8200, Denmark
| | - Henrik Hornshøj
- Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, Aarhus N, DK-8200, Denmark
| | - Jakob Skou Pedersen
- Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, Aarhus N, DK-8200, Denmark
| | - Asger Hobolth
- Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, Aarhus N, DK-8200, Denmark
|
29
|
Singh B, Trincado JL, Tatlow PJ, Piccolo SR, Eyras E. Genome Sequencing and RNA-Motif Analysis Reveal Novel Damaging Noncoding Mutations in Human Tumors. Mol Cancer Res 2018; 16:1112-1124. [DOI: 10.1158/1541-7786.mcr-17-0601] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 10/16/2017] [Revised: 01/26/2018] [Accepted: 03/16/2018] [Indexed: 11/16/2022]
|
30
|
Circular RNA expression is abundant and correlated to aggressiveness in early-stage bladder cancer. NPJ Genom Med 2017; 2:36. [PMID: 29263845 PMCID: PMC5705701 DOI: 10.1038/s41525-017-0038-z] [Citation(s) in RCA: 89] [Impact Index Per Article: 11.1] [Received: 07/25/2017] [Revised: 10/13/2017] [Accepted: 10/31/2017] [Indexed: 12/26/2022] Open
Abstract
The functions and biomarker potential of circular RNAs (circRNAs) in various cancer types are a rising field of study, as emerging evidence relates circRNAs to tumorigenesis. Here, we profiled the expression of circRNAs in 457 tumors from patients with non-muscle-invasive bladder cancer (NMIBC). We show that a set of highly expressed circRNAs have conserved core splice sites, are associated with Alu repeats, and enriched with Synonymous Constraint Elements as well as microRNA target sites. We identified 113 abundant circRNAs that are differentially expressed between high and low-risk tumor subtypes. Analysis of progression-free survival revealed 13 circRNAs, among them circHIPK3 and circCDYL, where expression correlated with progression independently of the linear transcript and the host gene. In summary, our results demonstrate that abundant circRNAs possess multiple biological features, distinguishing them from low-expressed circRNAs and non-circularized exons, and suggest that circRNAs might serve as a new class of prognostic biomarkers in NMIBC.
Expression levels of non-coding “circular” RNA molecules could be used as a prognostic biomarker for patients with early-stage bladder cancer. A team led by Trine Line Hauge Okholm and Jakob Skou Pedersen from Aarhus University Hospital, Denmark, profiled the expression of these loop-forming, potentially gene-regulating RNAs in biopsied tumor samples from 457 patients with bladder cancer that had not invaded nearby muscle tissue. They identified a suite of 113 circular RNAs that were abundant and differentially expressed between patients with different molecular subtypes of bladder cancer. The researchers also found a smaller set of 13 circular RNAs for which expression levels correlated with disease progression. These non-coding RNA molecules, by indicating likely patient outcomes, could potentially serve as future diagnostic aids to inform treatment strategies and decisions.
|