1. Campos-León K, Ferguson J, Günther T, Wood CD, Wingett SW, Pekel S, Varghese CS, Jones LS, Stockton JD, Várnai C, West MJ, Beggs A, Grundhoff A, Noyvert B, Roberts S, Parish JL. Repression of CADM1 transcription by HPV type 18 is mediated by three-dimensional rearrangement of promoter-enhancer interactions. PLoS Pathog 2025; 21:e1012506. [PMID: 39869645 PMCID: PMC11801731 DOI: 10.1371/journal.ppat.1012506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2024] [Revised: 02/06/2025] [Accepted: 12/02/2024] [Indexed: 01/29/2025]
Abstract
Upon infection, human papillomavirus (HPV) manipulates host cell gene expression to create an environment that is supportive of a productive and persistent infection. The virus-induced changes to the host cell's transcriptome are thought to contribute to carcinogenesis. Here, we show by RNA-sequencing that oncogenic HPV18 episome replication in primary human foreskin keratinocytes (HFKs) drives host transcriptional changes that are consistent between multiple HFK donors. We have previously shown that HPV18 recruits the host protein CTCF to viral episomes to control the differentiation-dependent viral transcriptional programme. Since CTCF is an important regulator of host cell transcription via coordination of epigenetic boundaries and long-range chromosomal interactions, we hypothesised that HPV18 may also manipulate CTCF to contribute to host transcription reprogramming. Analysis of CTCF binding in the host cell genome by ChIP-Seq revealed that while the total number of CTCF binding sites is not altered by the virus, there is a subset of CTCF binding sites that are either enriched or depleted of CTCF. Many of these altered sites are clustered within regulatory elements of differentially expressed genes, including the tumour suppressor gene cell adhesion molecule 1 (CADM1), which suppresses epithelial cell growth and invasion. We show that HPV18 establishment results in reduced CTCF binding at the CADM1 promoter and upstream enhancer. Loss of CTCF binding is coincident with epigenetic repression of CADM1, in the absence of CpG hypermethylation, while adjacent genes including the transcriptional regulator ZBTB16 are activated. These data indicate that the CADM1 locus is subject to topological rearrangement following HPV18 establishment. We tested this hypothesis using 4C-Seq (circular chromosome conformation capture-sequencing) and show that HPV18 establishment causes a loss of long-range chromosomal interactions between the CADM1 transcriptional start site and the upstream transcriptional enhancer. These data show that HPV18 manipulates host cell promoter-enhancer interactions to drive transcriptional reprogramming that may contribute to HPV-induced disease progression.
Affiliation(s)
- Karen Campos-León
  - Department of Cancer and Genomic Sciences, College of Medicine and Health, University of Birmingham, Birmingham, United Kingdom
- Jack Ferguson
  - Department of Cancer and Genomic Sciences, College of Medicine and Health, University of Birmingham, Birmingham, United Kingdom
- C. David Wood
  - School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Steven W. Wingett
  - The Babraham Institute, Babraham Research Campus, Cambridge, United Kingdom
- Selin Pekel
  - Department of Cancer and Genomic Sciences, College of Medicine and Health, University of Birmingham, Birmingham, United Kingdom
- Christy S. Varghese
  - Department of Cancer and Genomic Sciences, College of Medicine and Health, University of Birmingham, Birmingham, United Kingdom
- Leanne S. Jones
  - Department of Cancer and Genomic Sciences, College of Medicine and Health, University of Birmingham, Birmingham, United Kingdom
- Joanne D. Stockton
  - Department of Cancer and Genomic Sciences, College of Medicine and Health, University of Birmingham, Birmingham, United Kingdom
- Csilla Várnai
  - Department of Cancer and Genomic Sciences, College of Medicine and Health, University of Birmingham, Birmingham, United Kingdom
- Michelle J. West
  - School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Andrew Beggs
  - Department of Cancer and Genomic Sciences, College of Medicine and Health, University of Birmingham, Birmingham, United Kingdom
- Boris Noyvert
  - Department of Cancer and Genomic Sciences, College of Medicine and Health, University of Birmingham, Birmingham, United Kingdom
  - Birmingham CRUK Centre, University of Birmingham, Birmingham, United Kingdom
- Sally Roberts
  - Department of Cancer and Genomic Sciences, College of Medicine and Health, University of Birmingham, Birmingham, United Kingdom
- Joanna L. Parish
  - Department of Cancer and Genomic Sciences, College of Medicine and Health, University of Birmingham, Birmingham, United Kingdom
  - National Institute of Health Research, Biomedical Research Centre, University of Birmingham, Birmingham, United Kingdom
2. Ball STM, Celik N, Sayari E, Abdul Kadir L, O’Brien F, Barrett-Jolley R. DeepGANnel: Synthesis of fully annotated single molecule patch-clamp data using generative adversarial networks. PLoS One 2022; 17:e0267452. [PMID: 35536793 PMCID: PMC9089889 DOI: 10.1371/journal.pone.0267452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2021] [Accepted: 04/09/2022] [Indexed: 11/19/2022]
Abstract
Development of automated analysis tools for "single ion channel" recording is hampered by the lack of available training data. For machine-learning-based tools, very large training sets are necessary, with sample-by-sample point-labelled data (e.g., one sample point every 100 microseconds). In an experimental context, such data are labelled with human supervision, and whilst this is feasible for simple experimental analysis, it is infeasible to hand-craft the enormous datasets that would be necessary for a big data approach. In this work we aimed to develop methods to generate simulated ion channel data that are free from assumptions and prior knowledge of noise and underlying hidden Markov models. We successfully leverage generative adversarial networks (GANs) to build an end-to-end pipeline for generating an unlimited amount of labelled training data from a small, annotated ion channel "seed" record, and this needs no prior knowledge of theoretical dynamical ion channel properties. Our method utilises 2D CNNs to maintain the synchronised temporal relationship between the raw and idealised record. We demonstrate the applicability of the method with 5 different data sources and show authenticity with t-SNE and UMAP projection comparisons between real and synthetic data. The model would be easily extendable to other time series data requiring parallel labelling, such as labelled ECG signals or raw nanopore sequencing data.
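As an illustration of the "synchronised" raw/idealised pairing mentioned in this abstract, the following is a minimal sketch of one way to lay out such data: each window is a small 2D array whose rows are time steps and whose two columns are the raw current and its idealised label. The function name `paired_windows`, the window length, and the layout itself are assumptions made for illustration; the abstract does not specify how the authors arrange their arrays.

```python
def paired_windows(raw, idealised, window):
    """Cut two aligned records into non-overlapping 2D windows.

    Each window is a list of [raw_sample, idealised_label] rows, so any
    model consuming it sees both signals at the same time index.
    (Hypothetical layout; not taken from the DeepGANnel code.)
    """
    if len(raw) != len(idealised):
        raise ValueError("raw and idealised records must have equal length")
    windows = []
    # Step by a full window so that windows do not overlap;
    # trailing samples that do not fill a window are dropped.
    for start in range(0, len(raw) - window + 1, window):
        rows = [[r, s] for r, s in zip(raw[start:start + window],
                                       idealised[start:start + window])]
        windows.append(rows)
    return windows
```

Keeping the two signals in one array, rather than in two separate sequences, means that any cropping or shuffling of windows cannot desynchronise a sample from its label.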
Affiliation(s)
- Sam T. M. Ball
  - Faculty of Health and Life Science, University of Liverpool, Liverpool, United Kingdom
- Numan Celik
  - Faculty of Health and Life Science, University of Liverpool, Liverpool, United Kingdom
- Elaheh Sayari
  - Faculty of Health and Life Science, University of Liverpool, Liverpool, United Kingdom
- Lina Abdul Kadir
  - Faculty of Health and Life Science, University of Liverpool, Liverpool, United Kingdom
- Fiona O’Brien
  - Faculty of Health and Life Science, University of Liverpool, Liverpool, United Kingdom
3. Kopacheva E, Yantseva V. Users’ polarisation in dynamic discussion networks: The case of refugee crisis in Sweden. PLoS One 2022; 17:e0262992. [PMID: 35139109 PMCID: PMC8827437 DOI: 10.1371/journal.pone.0262992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2021] [Accepted: 01/10/2022] [Indexed: 11/19/2022]
Abstract
This paper presents a study on the dynamics of sentiment polarisation in the active online discussion communities formed around a controversial topic: immigration. Using a collection of tweets in the Swedish language from 2012 to 2019, we track the development of the communities and their sentiment polarisation trajectories over time and in the context of an exogenous shock represented by the European refugee crisis in 2015. To achieve the goal of the study, we apply methods of network and sentiment analysis to map users’ interactions in the network communities and quantify users’ sentiment polarities. The results of the analysis give little evidence for users’ polarisation in the network and its communities, and suggest that the crisis had a limited effect on the polarisation dynamics on this social media platform. Yet, we notice a shift towards more negative tonality of users’ sentiments after the crisis and discuss possible explanations for these observations.
Affiliation(s)
- Elizaveta Kopacheva
  - Department of Political Science & Centre for Data Intensive Sciences and Applications (DISA), Linnaeus University, Växjö, Sweden
- Victoria Yantseva
  - Department of Social Studies & Centre for Data Intensive Sciences and Applications (DISA), Linnaeus University, Växjö, Sweden
4. Staunton CA, Owen ED, Hemmings K, Vasilaki A, McArdle A, Barrett-Jolley R, Jackson MJ. Skeletal muscle transcriptomics identifies common pathways in nerve crush injury and ageing. Skelet Muscle 2022; 12:3. [PMID: 35093178 PMCID: PMC8800362 DOI: 10.1186/s13395-021-00283-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/29/2021] [Accepted: 11/24/2021] [Indexed: 12/16/2022]
Abstract
Motor unit remodelling involving repeated denervation and re-innervation occurs throughout life. The efficiency of this process declines with age, contributing to neuromuscular deficits. This study investigated differentially expressed genes (DEG) in muscle following peroneal nerve crush to model motor unit remodelling in C57BL/6J mice. Muscle RNA was isolated at 3 days post-crush; RNA libraries were generated using poly-A selection, sequenced and analysed using gene ontology and pathway tools. Three hundred thirty-four DEG were found in quiescent muscle from old (26-month) compared with adult (4-6-month) mice, and these same DEG were present in muscle from adult mice following nerve crush. Peroneal crush induced 7133 DEG in muscles of adult and 699 DEG in muscles from old mice, although only one DEG (ZCCHC17) was found when directly comparing nerve-crushed muscles from old and adult mice. This analysis revealed key differences in muscle responses which may underlie the diminished ability of old mice to repair following nerve injury.
Affiliation(s)
- C A Staunton
  - MRC-Versus Arthritis Research Centre for Integrated research into Musculoskeletal Ageing (CIMA), Department of Musculoskeletal and Ageing Science, Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool, L7 8TX, UK
- E D Owen
  - MRC-Versus Arthritis Research Centre for Integrated research into Musculoskeletal Ageing (CIMA), Department of Musculoskeletal and Ageing Science, Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool, L7 8TX, UK
- K Hemmings
  - MRC-Versus Arthritis Research Centre for Integrated research into Musculoskeletal Ageing (CIMA), Department of Musculoskeletal and Ageing Science, Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool, L7 8TX, UK
- A Vasilaki
  - MRC-Versus Arthritis Research Centre for Integrated research into Musculoskeletal Ageing (CIMA), Department of Musculoskeletal and Ageing Science, Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool, L7 8TX, UK
- A McArdle
  - MRC-Versus Arthritis Research Centre for Integrated research into Musculoskeletal Ageing (CIMA), Department of Musculoskeletal and Ageing Science, Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool, L7 8TX, UK
- R Barrett-Jolley
  - MRC-Versus Arthritis Research Centre for Integrated research into Musculoskeletal Ageing (CIMA), Department of Musculoskeletal and Ageing Science, Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool, L7 8TX, UK
- M J Jackson
  - MRC-Versus Arthritis Research Centre for Integrated research into Musculoskeletal Ageing (CIMA), Department of Musculoskeletal and Ageing Science, Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool, L7 8TX, UK
5. Arnol D, Schapiro D, Bodenmiller B, Saez-Rodriguez J, Stegle O. Modeling Cell-Cell Interactions from Spatial Molecular Data with Spatial Variance Component Analysis. Cell Rep 2020; 29:202-211.e6. [PMID: 31577949 PMCID: PMC6899515 DOI: 10.1016/j.celrep.2019.08.077] [Citation(s) in RCA: 137] [Impact Index Per Article: 27.4] [Received: 03/24/2018] [Revised: 04/11/2019] [Accepted: 08/22/2019] [Indexed: 12/22/2022]
Abstract
Technological advances enable assaying multiplexed spatially resolved RNA and protein expression profiling of individual cells, thereby capturing molecular variations in physiological contexts. While these methods are increasingly accessible, computational approaches for studying the interplay of the spatial structure of tissues and cell-cell heterogeneity are only beginning to emerge. Here, we present spatial variance component analysis (SVCA), a computational framework for the analysis of spatial molecular data. SVCA enables quantifying different dimensions of spatial variation and in particular quantifies the effect of cell-cell interactions on gene expression. In a breast cancer Imaging Mass Cytometry dataset, our model yields interpretable spatial variance signatures, which reveal cell-cell interactions as a major driver of protein expression heterogeneity. Applied to high-dimensional imaging-derived RNA data, SVCA identifies plausible gene families that are linked to cell-cell interactions. SVCA is available as a free software tool that can be widely applied to spatial data from different technologies.
Affiliation(s)
- Damien Arnol
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Denis Schapiro
  - Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland; Life Science Zurich Graduate School, ETH Zurich and University of Zurich, Zurich, Switzerland
- Bernd Bodenmiller
  - Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- Julio Saez-Rodriguez
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Joint Research Center for Computational Biomedicine, RWTH Aachen University, Faculty of Medicine, Pauwelsstrasse 19, 52074 Aachen, Germany; Institute for Computational Biomedicine, Heidelberg University, Faculty of Medicine, Bioquant, 69120 Heidelberg
- Oliver Stegle
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany; Division of Computational Genomics and Systems Genetics, German Cancer Research Center, 69120 Heidelberg, Germany
6. Blair JPM, Bager C, Platt A, Karsdal M, Bay-Jensen AC. Identification of pathological RA endotypes using blood-based biomarkers reflecting tissue metabolism. A retrospective and explorative analysis of two phase III RA studies. PLoS One 2019; 14:e0219980. [PMID: 31339920 PMCID: PMC6655687 DOI: 10.1371/journal.pone.0219980] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 01/14/2019] [Accepted: 07/05/2019] [Indexed: 12/31/2022]
Abstract
There is an increasing demand for accurate endotyping of patients according to their pathogenesis to allow more targeted treatment. We explore a combination of blood-based joint tissue metabolites (neoepitopes) to enable patient clustering through distinct disease profiles. We analysed data from two RA studies (LITHE (N = 574, follow-up 24 and 52 weeks), OSKIRA-1 (N = 131, follow-up 24 weeks)). Two osteoarthritis (OA) studies (SMC01 (N = 447), SMC02 (N = 81)) were included as non-RA comparators. Specific tissue-derived neoepitopes measured at baseline included: C2M (cartilage degradation); CTX-I and PINP (bone turnover); C1M and C3M (interstitial matrix degradation); CRPM (CRP metabolite) and VICM (macrophage activity). Clustering was performed to identify putative endotypes. We identified five clusters (A-E). Clusters A and B were characterised by generally higher levels of biomarkers than other clusters, except VICM, which was significantly higher in cluster B than in cluster A (p<0.001). Biomarker levels in Cluster C were all close to the median, whilst Cluster D was characterised by low levels of all biomarkers. Cluster E also had low levels of most biomarkers, but with significantly higher levels of CTX-I compared to cluster D. There was a significant difference in ΔSHP score observed at 52 weeks (p<0.05). We describe putative RA endotypes based on biomarkers reflecting joint tissue metabolism. These endotypes differ in their underlying pathogenesis, and may in the future have utility for patient treatment selection.
Affiliation(s)
- J. P. M. Blair
  - ProScion, Herlev, Denmark
  - University of Copenhagen, Faculty of Health and Medical Sciences, Copenhagen, Denmark
- A. Platt
  - Target & Translational Science, Respiratory, Inflammation and Autoimmunity (RIA), IMED Biotech Unit, AstraZeneca, Gothenburg, Sweden
- M. Karsdal
  - Rheumatology, Nordic Bioscience, Biomarkers and Research, Herlev, Denmark
- A.-C. Bay-Jensen
  - Rheumatology, Nordic Bioscience, Biomarkers and Research, Herlev, Denmark
7. Lee LH, Halu A, Morgan S, Iwata H, Aikawa M, Singh SA. XINA: A Workflow for the Integration of Multiplexed Proteomics Kinetics Data with Network Analysis. J Proteome Res 2019; 18:775-781. [PMID: 30370770 DOI: 10.1021/acs.jproteome.8b00615] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 01/18/2023]
Abstract
Quantitative proteomics experiments, using for instance isobaric tandem mass tagging approaches, are conducive to measuring changes in protein abundance over multiple time points in response to one or more conditions or stimulations. The aim is often to determine which proteins exhibit similar patterns within and across experimental conditions, since proteins with coabundance patterns may have common molecular functions related to a given stimulation. In order to facilitate the identification and analysis of coabundance patterns within and across conditions, we previously developed software inspired by the isobaric mass tagging method itself. Specifically, multiple data sets are tagged in silico and combined for subsequent subgrouping into multiple clusters within a single output depicting the variation across all conditions, converting a typical inter-data-set comparison into an intra-data-set comparison. An updated version of our software, XINA, not only extracts coabundance profiles within and across experiments but also incorporates protein-protein interaction databases and integrative resources such as KEGG to infer interactors and molecular functions, respectively, and produces intuitive graphical outputs. In this report, we compare the kinetics profiles of >5600 unique proteins derived from three macrophage cell culture experiments and demonstrate through intuitive visualizations that XINA identifies key regulators of macrophage activation via their coabundance patterns.
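The "tag in silico and combine" step described in this abstract can be sketched roughly as follows. This is not XINA's API (XINA is distributed as an R package); the function name `tag_and_combine`, the `condition|protein` key format, and the sum-to-one normalisation are all assumptions chosen for illustration.

```python
def tag_and_combine(datasets):
    """Merge per-condition time courses into one tagged data set.

    datasets: {condition: {protein: [abundance at each time point]}}
    Returns {"condition|protein": normalised profile}, so that profiles
    from every condition can be clustered together in a single run.
    (Hypothetical sketch; the normalisation is an assumption.)
    """
    combined = {}
    for condition, profiles in datasets.items():
        for protein, values in profiles.items():
            total = sum(values)
            if total == 0:
                continue  # skip proteins with no measured signal
            # Tag the protein with its condition and scale the profile
            # so that only its shape over time is compared.
            combined[f"{condition}|{protein}"] = [v / total for v in values]
    return combined
```

Once every profile carries its condition tag, any clustering routine can be run on the combined table, and a protein that lands in different clusters under different conditions is a candidate for condition-specific behaviour.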
Affiliation(s)
- Lang Ho Lee
  - Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, United States
- Arda Halu
  - Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, United States
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, United States
- Stephanie Morgan
  - Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, United States
- Hiroshi Iwata
  - Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, United States
- Masanori Aikawa
  - Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, United States
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, United States
  - Center for Excellence in Vascular Biology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, United States
- Sasha A Singh
  - Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, United States
8. Demidenko E. The next-generation K-means algorithm. Stat Anal Data Min 2018; 11:153-166. [PMID: 30073045 PMCID: PMC6062903 DOI: 10.1002/sam.11379] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 04/28/2016] [Revised: 03/29/2018] [Accepted: 04/06/2018] [Indexed: 11/09/2022]
Abstract
Typically, when referring to a model-based classification, the mixture distribution approach is understood. In contrast, we revive the hard-classification model-based approach developed by Banfield and Raftery (1993) for which K-means is equivalent to the maximum likelihood (ML) estimation. The next-generation K-means algorithm does not end after the classification is achieved, but moves forward to answer the following fundamental questions: Are there clusters, how many clusters are there, what are the statistical properties of the estimated means and index sets, what is the distribution of the coefficients in the clusterwise regression, and how to classify multilevel data? The statistical model-based approach for the K-means algorithm is the key, because it allows statistical simulations and studying the properties of classification following the track of the classical statistics. This paper illustrates the application of the ML classification to testing the no-clusters hypothesis, to studying various methods for selection of the number of clusters using simulations, robust clustering using Laplace distribution, studying properties of the coefficients in clusterwise regression, and finally to multilevel data by marrying the variance components model with K-means.
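For readers unfamiliar with the baseline that this abstract builds on, here is a minimal sketch of the standard K-means (Lloyd) iteration. It is not the paper's next-generation algorithm. Assuming the simplest hard-classification model (spherical Gaussian clusters with a common variance, which is my assumption about the setting), maximising the classification likelihood amounts to minimising the within-cluster sum of squares computed by `within_cluster_ss`. All function names here are illustrative.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd iteration: alternate assignment and mean update."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise from k distinct data points
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the means are too
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels

def within_cluster_ss(points, centers, labels):
    """Criterion that K-means reduces at each iteration."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
```

Because the iteration only finds a local minimum, results depend on the initial centers; running it from several seeds and keeping the solution with the smallest `within_cluster_ss` is a common precaution.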
Affiliation(s)
- Eugene Demidenko
  - Department of Biomedical Data Science and Department of Mathematics, Dartmouth College, Hanover, New Hampshire