1
Mirzaei-nasab F, Majd A, Seyedena Y, Hosseinkhan N, Farahani N, Hashemi M. Integrative analysis of exosomal ncRNAs and their regulatory networks in liver cancer progression. Pract Lab Med 2025; 45:e00464. [PMID: 40226122 PMCID: PMC11992429 DOI: 10.1016/j.plabm.2025.e00464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2024] [Revised: 01/19/2025] [Accepted: 03/07/2025] [Indexed: 04/15/2025] Open
Abstract
Background Hepatocellular carcinoma (HCC) is a significant global health challenge with complex molecular underpinnings. Recent advancements in understanding the role of non-coding RNAs (ncRNAs) and exosomes in cancer biology have opened new avenues for research into potential diagnostic and therapeutic strategies. Methods This study utilized a comprehensive approach to analyze gene expression patterns and regulatory networks in HCC. We integrated RNA sequencing data gathered from both tissue samples and exosomes. The WGCNA and limma R packages were employed to construct co-expression networks and identify differentially expressed ncRNAs, including long non-coding RNAs (lncRNAs) and circular RNAs (circRNAs). Results Our analysis demonstrated distinct expression profiles of various ncRNAs in HCC, revealing their intricate interactions with cancer-related genes. Key findings include the identification of a network of microRNAs that interact with selected lncRNAs and their potential roles as biomarkers. Moreover, exosomal RNA was shown to effectively reflect tissue-specific gene expression changes. Conclusions The results of this study highlight the significance of exosomal ncRNAs in the progression of liver cancer, suggesting their potential as both diagnostic biomarkers and therapeutic targets. Future research should focus on the functional implications of these ncRNAs to further elucidate their roles in HCC and explore their applications in clinical settings.
Affiliation(s)
- Farzin Mirzaei-nasab
  - Department of Genetics, Faculty of Biological Sciences, North Tehran Branch, Islamic Azad University, Tehran, Iran
  - Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Ahmad Majd
  - Department of Genetics, Faculty of Biological Sciences, North Tehran Branch, Islamic Azad University, Tehran, Iran
- Yousef Seyedena
  - Department of Genetics, Faculty of Biological Sciences, North Tehran Branch, Islamic Azad University, Tehran, Iran
- Nazanin Hosseinkhan
  - Endocrine Research Center, Institute of Endocrinology and Metabolism, Iran University of Medical Sciences, Tehran, Iran
- Najma Farahani
  - Farhikhtegan Medical Convergence Sciences Research Center, Farhikhtegan Hospital Tehran Medical Sciences, Islamic Azad University, Tehran, Iran
- Mehrdad Hashemi
  - Farhikhtegan Medical Convergence Sciences Research Center, Farhikhtegan Hospital Tehran Medical Sciences, Islamic Azad University, Tehran, Iran
  - Department of Genetics, Faculty of Advanced Science and Technology, Tehran Medical Sciences, Islamic Azad University, Tehran, Iran
2
Hänggi NV, Neubauer J, Marti Y, Banemann R, Kulstein G, Courts C, Gosch A, Hadrys T, Haas C, Dørum G. Assessing transcriptomic signatures of aging: Testing an mRNA marker panel for forensic age estimation of blood samples. Forensic Sci Int Genet 2025; 78:103282. [PMID: 40209357 DOI: 10.1016/j.fsigen.2025.103282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2024] [Revised: 03/27/2025] [Accepted: 03/31/2025] [Indexed: 04/12/2025]
Abstract
Estimating the age of an unknown perpetrator can be a valuable tool in narrowing down a group of suspects. Research efforts to estimate the age of a stain donor have mainly focused on epigenetic modifications, but there is evidence that RNA expression patterns, i.e. the composition of the transcriptome, change with increasing age, which could be a promising molecular alternative for age prediction. In a previous study, we identified a total of 508 mRNA markers with age related expression from two blood whole transcriptome sequencing data sets, using differential expression analysis with DESeq2 and marker selection with lasso regression. For this study, the selected markers from both approaches were combined into an RNA-specific targeted MPS assay for the Ion Torrent platform and evaluated with 100 EDTA blood samples from healthy donors (aged between 23 and 73 years). We compared three different normalization methods for the obtained sequencing data and investigated the performance of various regression techniques for age prediction. The model based on elastic net regression and dSVA-normalized data exhibited the most robust performance, achieving an MAE of 9.29 years and a correlation of 0.57 between the chronological and predicted age. Although the use of a targeted approach instead of RNA-Seq offers several advantages in a forensic setting, we observed a considerable amount of unwanted variation in the targeted sequencing data. We conclude that it is challenging to detect distinct signals associated with chronological age.
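The two figures quoted in this abstract, the mean absolute error (MAE) and the correlation between chronological and predicted age, can be computed as in the minimal sketch below. The ages are hypothetical values chosen for illustration, not data from the study, and the function names are assumptions of this example.

```python
from math import sqrt


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical ages in years, for illustration only.
chronological = [25, 40, 55, 70]
predicted = [30, 38, 50, 62]

mae = mean_absolute_error(chronological, predicted)
r = pearson_correlation(chronological, predicted)
print(f"MAE = {mae:.2f} years, r = {r:.2f}")
```

Reporting both numbers is useful because they capture different things: MAE measures the typical size of the error in years, while the correlation measures how well the ranking of donors by age is preserved.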
Affiliation(s)
- Jacqueline Neubauer
  - Zurich Institute of Forensic Medicine, University of Zurich, Zurich, Switzerland
- Yael Marti
  - Zurich Institute of Forensic Medicine, University of Zurich, Zurich, Switzerland
- Cornelius Courts
  - University Hospital of Cologne, Institute of Legal Medicine, Cologne, Germany
- Annica Gosch
  - University Hospital of Cologne, Institute of Legal Medicine, Cologne, Germany
- Thorsten Hadrys
  - Bavarian State Criminal Police Office (BLKA), Munich, Germany
- Cordula Haas
  - Zurich Institute of Forensic Medicine, University of Zurich, Zurich, Switzerland
- Guro Dørum
  - Zurich Institute of Forensic Medicine, University of Zurich, Zurich, Switzerland; Nofima - Norwegian Institute of Food, Fisheries and Aquaculture Research, Ås, Norway
3
Casaletto JA, Scott RT, Myrick M, Mackintosh G, Chok H, Saravia-Butler A, Hoarfrost A, Galazka JM, Sanders LM, Costes SV. Analyzing the relationship between gene expression and phenotype in space-flown mice using a causal inference machine learning ensemble. Sci Rep 2025; 15:2363. [PMID: 39824847 PMCID: PMC11748630 DOI: 10.1038/s41598-024-81394-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2024] [Accepted: 11/26/2024] [Indexed: 01/20/2025] Open
Abstract
Spaceflight has several detrimental effects on human and rodent health. For example, liver dysfunction is a common phenotype observed in space-flown rodents, and this dysfunction is partially reflected in transcriptomic changes. Studies linking transcriptomics with liver dysfunction rely on tools which exploit correlation, but these tools make no attempt to disambiguate true correlations from spurious ones. In this work, we use a machine learning ensemble of causal inference methods called the Causal Research and Inference Search Platform (CRISP) which was developed to predict causal features of a binary response variable from high-dimensional input. We used CRISP to identify genes robustly correlated with a lipid density phenotype using transcriptomic and histological data from the NASA Open Science Data Repository (OSDR). Our approach identified genes and molecular targets not predicted by previous traditional differential gene expression analyses. These genes are likely to play a pivotal role in the liver dysfunction observed in space-flown rodents, and this work opens the door to identifying novel countermeasures for space travel.
Affiliation(s)
- James A Casaletto
  - Blue Marble Space Institute of Science, NASA Ames, Mountain View, USA
- Makenna Myrick
  - Department of Chemistry, University of Florida, Gainesville, USA
- Hamed Chok
  - Blue Marble Space Institute of Science, NASA Ames, Mountain View, USA
4
Vo DHT, Thorne T. Shrinkage estimation of gene interaction networks in single-cell RNA sequencing data. BMC Bioinformatics 2024; 25:339. [PMID: 39462345 PMCID: PMC11515282 DOI: 10.1186/s12859-024-05946-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/25/2024] [Accepted: 09/23/2024] [Indexed: 10/29/2024] Open
Abstract
BACKGROUND Gene interaction networks are graphs in which nodes represent genes and edges represent functional interactions between them. These interactions can occur at multiple levels, for instance gene regulation, protein-protein interaction, or metabolic pathways. To analyse gene interaction networks at a large scale, gene co-expression network analysis is often applied to high-throughput gene expression data such as RNA sequencing data. With advances in sequencing technology, gene expression can now be measured in individual cells. Single-cell RNA sequencing (scRNAseq) provides insights into cellular development, differentiation and characteristics at the transcriptomic level. High sparsity and high-dimensional data structures pose challenges in scRNAseq data analysis. RESULTS In this study, a sparse inverse covariance matrix estimation framework for scRNAseq data is developed to capture direct functional interactions between genes. Comparative analyses on simulated scRNAseq data highlight the high performance and fast computation of Stein-type shrinkage in high-dimensional data. Data transformation approaches also improve the performance of shrinkage methods on non-Gaussian distributed data. Zero-inflated modelling of scRNAseq data based on a negative binomial distribution enhances shrinkage performance on zero-inflated data without interfering with non-zero-inflated count data. CONCLUSION The proposed framework broadens the application of graphical models in scRNAseq analysis, offering flexibility with respect to the sparsity of count data resulting from dropout events, high performance, and fast computation. The framework is implemented as a reproducible Snakemake workflow (https://github.com/calathea24/ZINBGraphicalModel) and an R package, ZINBStein (https://github.com/calathea24/ZINBStein).
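The general idea behind shrinkage estimation of an inverse covariance matrix can be sketched as follows. This is not the ZINBStein implementation: it assumes a plain linear shrinkage of the sample covariance toward its diagonal with a fixed, hand-picked intensity `lam`, whereas Stein-type estimators choose the intensity from the data. The function names and the simulated matrix are hypothetical.

```python
import numpy as np


def shrink_covariance(data, lam):
    """Linearly shrink the sample covariance toward its diagonal.

    data: array of shape (n_samples, n_genes).
    lam:  shrinkage intensity in [0, 1]; fixed here as an assumption,
          whereas Stein-type methods estimate it from the data.
    """
    sample_cov = np.cov(data, rowvar=False)
    target = np.diag(np.diag(sample_cov))  # keep variances, drop covariances
    return (1.0 - lam) * sample_cov + lam * target


def partial_correlations(cov):
    """Partial correlations from the inverse covariance (precision) matrix."""
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Simulated data with fewer samples than genes, so the raw sample
# covariance is singular and cannot be inverted directly.
rng = np.random.default_rng(0)
data = rng.normal(size=(5, 20))  # 5 cells, 20 genes (hypothetical)

shrunk = shrink_covariance(data, lam=0.5)
pcor = partial_correlations(shrunk)
print(pcor.shape)
```

Shrinking toward the diagonal makes the estimate positive definite even when genes outnumber cells, which is what allows the inversion step. Nonzero off-diagonal partial correlations are then read as candidate direct interactions.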
Affiliation(s)
- Duong H T Vo
  - Computer Science Research Centre, University of Surrey, Guildford, UK
- Thomas Thorne
  - Computer Science Research Centre, University of Surrey, Guildford, UK
5
Agraz M, Goksuluk D, Zhang P, Choi BR, Clements RT, Choudhary G, Karniadakis GE. ML-GAP: machine learning-enhanced genomic analysis pipeline using autoencoders and data augmentation. Front Genet 2024; 15:1442759. [PMID: 39399219 PMCID: PMC11467662 DOI: 10.3389/fgene.2024.1442759] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/02/2024] [Accepted: 09/03/2024] [Indexed: 10/15/2024] Open
Abstract
Introduction The advent of RNA sequencing (RNA-Seq) has significantly advanced our understanding of the transcriptomic landscape, revealing intricate gene expression patterns across biological states and conditions. However, the complexity and volume of RNA-Seq data pose challenges in identifying differentially expressed genes (DEGs), which are critical for understanding the molecular basis of diseases like cancer. Methods We introduce a novel Machine Learning-Enhanced Genomic Data Analysis Pipeline (ML-GAP) that incorporates autoencoders and innovative data augmentation strategies, notably the MixUp method, to overcome these challenges. By creating synthetic training examples through a linear combination of input pairs and their labels, MixUp significantly enhances the model's ability to generalize from the training data to unseen examples. Results Our results demonstrate ML-GAP's superiority in accuracy, efficiency, and insights, with the MixUp method contributing substantially to the pipeline's effectiveness, greatly advancing genomic data analysis and setting a new standard in the field. Discussion These results suggest that ML-GAP not only has the potential to detect DEGs more accurately but also offers new avenues for therapeutic intervention and research. By integrating explainable artificial intelligence (XAI) techniques, ML-GAP ensures a transparent and interpretable analysis, highlighting the significance of identified genetic markers.
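The MixUp step described in the Methods, a linear combination of input pairs and their labels, can be sketched as below. This is a generic illustration rather than the ML-GAP code: the mixing weight is drawn from a Beta(alpha, alpha) distribution as in the standard MixUp formulation, and the value `alpha=0.2`, the function name, and the toy expression matrix are assumptions of this example.

```python
import numpy as np


def mixup(features, labels, alpha=0.2, rng=None):
    """Blend each sample with a randomly chosen partner from the same batch.

    features: array of shape (n_samples, n_features).
    labels:   one-hot array of shape (n_samples, n_classes).
    alpha:    Beta distribution parameter (hypothetical default).
    Returns the mixed features, the mixed labels, and the mixing weight.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)           # one mixing weight per batch
    perm = rng.permutation(len(features))  # partner index for every sample
    mixed_x = lam * features + (1.0 - lam) * features[perm]
    mixed_y = lam * labels + (1.0 - lam) * labels[perm]
    return mixed_x, mixed_y, lam


# Toy batch: 6 samples, 4 expression features, 2 classes (hypothetical).
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 4))
labels = np.eye(2)[[0, 1, 0, 1, 1, 0]]  # one-hot encoded class labels

mixed_x, mixed_y, lam = mixup(features, labels, alpha=0.2, rng=rng)
print(mixed_x.shape, mixed_y.shape)
```

Because the labels are mixed with the same weight as the inputs, each synthetic sample carries a soft label whose entries still sum to one, which is what lets a classifier train on interpolated examples.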
Affiliation(s)
- Melih Agraz
  - Division of Applied Mathematics, Brown University, Providence, RI, United States
  - Department of Statistics, Giresun University, Giresun, Türkiye
- Dincer Goksuluk
  - Department of Biostatistics, Erciyes University, Kayseri, Türkiye
- Peng Zhang
  - Vascular Research Laboratory, VA Providence Healthcare System, Providence, RI, United States
  - Division of Cardiology, Department of Medicine, Alpert Medical School of Brown University, Providence, RI, United States
- Bum-Rak Choi
  - Division of Cardiology, Department of Medicine, Alpert Medical School of Brown University, Providence, RI, United States
  - Cardiovascular Research Center, Rhode Island Hospital, Providence, RI, United States
- Richard T. Clements
  - Vascular Research Laboratory, VA Providence Healthcare System, Providence, RI, United States
  - Department of Biomedical and Pharmaceutical Sciences, University of Rhode Island College of Pharmacy, South Kingston, RI, United States
- Gaurav Choudhary
  - Vascular Research Laboratory, VA Providence Healthcare System, Providence, RI, United States
  - Division of Cardiology, Department of Medicine, Alpert Medical School of Brown University, Providence, RI, United States
  - Cardiovascular Research Center, Rhode Island Hospital, Providence, RI, United States
- George Em Karniadakis
  - Division of Applied Mathematics, Brown University, Providence, RI, United States
  - School of Engineering, Brown University, Providence, RI, United States
6
Smail C, Montgomery SB. RNA Sequencing in Disease Diagnosis. Annu Rev Genomics Hum Genet 2024; 25:353-367. [PMID: 38360541 DOI: 10.1146/annurev-genom-021623-121812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2024]
Abstract
RNA sequencing (RNA-seq) enables the accurate measurement of multiple transcriptomic phenotypes for modeling the impacts of disease variants. Advances in technologies, experimental protocols, and analysis strategies are rapidly expanding the application of RNA-seq to identify disease biomarkers, tissue- and cell-type-specific impacts, and the spatial localization of disease-associated mechanisms. Ongoing international efforts to construct biobank-scale transcriptomic repositories with matched genomic data across diverse population groups are further increasing the utility of RNA-seq approaches by providing large-scale normative reference resources. The availability of these resources, combined with improved computational analysis pipelines, has enabled the detection of aberrant transcriptomic phenotypes underlying rare diseases. Further expansion of these resources, across both somatic and developmental tissues, is expected to soon provide unprecedented insights to resolve disease origin, mechanism of action, and causal gene contributions, suggesting the continued high utility of RNA-seq in disease diagnosis.
Affiliation(s)
- Craig Smail
  - Genomic Medicine Center, Children's Mercy Research Institute, Children's Mercy Kansas City, Kansas City, Missouri, USA
- Stephen B Montgomery
  - Department of Biomedical Data Science, Department of Genetics, and Department of Pathology, Stanford University School of Medicine, Stanford, California, USA
7
Suita Y, Bright H, Pu Y, Toruner MD, Idehen J, Tapinos N, Singh R. Machine learning on multiple epigenetic features reveals H3K27Ac as a driver of gene expression prediction across patients with glioblastoma. bioRxiv 2024:2024.06.25.600585. [PMID: 38979226 PMCID: PMC11230286 DOI: 10.1101/2024.06.25.600585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Cancer cells show remarkable plasticity and can switch lineages in response to the tumor microenvironment. Cellular plasticity drives invasiveness and metastasis and helps cancer cells to evade therapy by developing resistance to radiation and cytotoxic chemotherapy. Increased understanding of cell fate determination through epigenetic reprogramming is critical to discover how cancer cells achieve transcriptomic and phenotypic plasticity. Glioblastoma is a perfect example of cancer evolution where cells retain an inherent level of plasticity through activation or maintenance of progenitor developmental programs. However, the principles governing epigenetic drivers of cellular plasticity in glioblastoma remain poorly understood. Here, using machine learning (ML), we perform cross-patient prediction of transcript expression using a combination of epigenetic features (ATAC-seq, CTCF ChIP-seq, RNAPII ChIP-seq, H3K27Ac ChIP-seq, and RNA-seq) of glioblastoma stem cells (GSCs). We investigate different ML and deep learning (DL) models for this task and build our final pipeline using XGBoost. The model trained on one patient generalizes to another, suggesting that the epigenetic signals governing gene transcription are consistent across patients even if GSCs can be very different. We demonstrate that H3K27Ac is the epigenetic feature providing the most significant contribution to cross-patient prediction of gene expression. In addition, using H3K27Ac signals from patient-derived GSCs, we can predict gene expression of human neural crest stem cells, suggesting a shared developmental epigenetic trajectory between subpopulations of these malignant and benign stem cells. Our cross-patient ML/DL models determine weighted patterns of influence of epigenetic marks on gene expression across patients with glioblastoma and between GSCs and neural crest stem cells.
We propose that broader application of this analysis could reshape our view of glioblastoma tumor evolution and inform the design of new epigenetic targeting therapies.
Affiliation(s)
- Yusuke Suita
  - Laboratory of Cancer Epigenetics and Plasticity, Department of Neurosurgery, Brown University, Providence, RI 02903, USA
- Hardy Bright
  - Data Science Institute, Brown University, Providence, RI 02903, USA
- Yuan Pu
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02903, USA
- Merih Deniz Toruner
  - Laboratory of Cancer Epigenetics and Plasticity, Department of Neurosurgery, Brown University, Providence, RI 02903, USA
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02903, USA
- Jordan Idehen
  - Department of Computer Science, Brown University, Providence, RI 02903, USA
- Nikos Tapinos
  - Laboratory of Cancer Epigenetics and Plasticity, Department of Neurosurgery, Brown University, Providence, RI 02903, USA
  - Brown RNA Center, Brown University, Providence, RI 02903, USA
- Ritambhara Singh
  - Department of Computer Science, Brown University, Providence, RI 02903, USA
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02903, USA
8
Tiong KL, Luzhbin D, Yeang CH. Assessing transcriptomic heterogeneity of single-cell RNASeq data by bulk-level gene expression data. BMC Bioinformatics 2024; 25:209. [PMID: 38867193 PMCID: PMC11167951 DOI: 10.1186/s12859-024-05825-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 06/03/2024] [Indexed: 06/14/2024] Open
Abstract
BACKGROUND Single-cell RNA sequencing (sc-RNASeq) data illuminate transcriptomic heterogeneity but also possess a high level of noise, abundant missing entries and sometimes inadequate or no cell type annotations at all. Bulk-level gene expression data lack direct information of cell population composition but are more robust and complete and often better annotated. We propose a modeling framework to integrate bulk-level and single-cell RNASeq data to address the deficiencies and leverage the mutual strengths of each type of data and enable a more comprehensive inference of their transcriptomic heterogeneity. Contrary to the standard approaches of factorizing the bulk-level data with one algorithm and (for some methods) treating single-cell RNASeq data as references to decompose bulk-level data, we employed multiple deconvolution algorithms to factorize the bulk-level data, constructed the probabilistic graphical models of cell-level gene expressions from the decomposition outcomes, and compared the log-likelihood scores of these models in single-cell data. We term this framework backward deconvolution as inference operates from coarse-grained bulk-level data to fine-grained single-cell data. As the abundant missing entries in sc-RNASeq data have a significant effect on log-likelihood scores, we also developed a criterion for inclusion or exclusion of zero entries in log-likelihood score computation. RESULTS We selected nine deconvolution algorithms and validated backward deconvolution in five datasets. In the in-silico mixtures of mouse sc-RNASeq data, the log-likelihood scores of the deconvolution algorithms were strongly anticorrelated with their errors of mixture coefficients and cell type specific gene expression signatures. In the true bulk-level mouse data, the sample mixture coefficients were unknown but the log-likelihood scores were strongly correlated with accuracy rates of inferred cell types. 
In the data of autism spectrum disorder (ASD) and normal controls, we found that ASD brains possessed higher fractions of astrocytes and lower fractions of NRGN-expressing neurons than normal controls. In datasets of breast cancer and low-grade gliomas (LGG), we compared the log-likelihood scores of three simple hypotheses about the gene expression patterns of the cell types underlying the tumor subtypes. The model that tumors of each subtype were dominated by one cell type persistently outperformed an alternative model that each cell type had elevated expression in one gene group and tumors were mixtures of those cell types. Superiority of the former model is also supported by comparing the real breast cancer sc-RNASeq clusters with those generated by simulated sc-RNASeq data. CONCLUSIONS The results indicate that backward deconvolution serves as a sensible model selection tool for deconvolution algorithms and facilitates discerning hypotheses about cell type compositions underlying heterogeneous specimens such as tumors.
Affiliation(s)
- Khong-Loon Tiong
  - Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
- Dmytro Luzhbin
  - Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
9
Yuan Y, Li H, Sreeram K, Malankhanova T, Boddu R, Strader S, Chang A, Bryant N, Yacoubian TA, Standaert DG, Erb M, Moore DJ, Sanders LH, Lutz MW, Velmeshev D, West AB. Single molecule array measures of LRRK2 kinase activity in serum link Parkinson's disease severity to peripheral inflammation. Mol Neurodegener 2024; 19:47. [PMID: 38862989 PMCID: PMC11167795 DOI: 10.1186/s13024-024-00738-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Accepted: 06/02/2024] [Indexed: 06/13/2024] Open
Abstract
BACKGROUND LRRK2-targeting therapeutics that inhibit LRRK2 kinase activity have advanced to clinical trials in idiopathic Parkinson's disease (iPD). LRRK2 phosphorylates Rab10 on endolysosomes in phagocytic cells to promote some types of immunological responses. The identification of factors that regulate LRRK2-mediated Rab10 phosphorylation in iPD, and whether phosphorylated-Rab10 levels change in different disease states, or with disease progression, may provide insights into the role of Rab10 phosphorylation in iPD and help guide therapeutic strategies targeting this pathway. METHODS Capitalizing on past work demonstrating LRRK2 and phosphorylated-Rab10 interact on vesicles that can shed into biofluids, we developed and validated a high-throughput single-molecule array assay to measure extracellular pT73-Rab10. Ratios of pT73-Rab10 to total Rab10 measured in biobanked serum samples were compared between informative groups of transgenic mice, rats, and a deeply phenotyped cohort of iPD cases and controls. Multivariable and weighted correlation network analyses were used to identify genetic, transcriptomic, clinical, and demographic variables that predict the extracellular pT73-Rab10 to total Rab10 ratio. RESULTS pT73-Rab10 is absent in serum from Lrrk2 knockout mice but elevated by LRRK2 and VPS35 mutations, as well as SNCA expression. Bone-marrow transplantation experiments in mice show that serum pT73-Rab10 levels derive primarily from circulating immune cells. The extracellular ratio of pT73-Rab10 to total Rab10 is dynamic, increasing with inflammation and rapidly decreasing with LRRK2 kinase inhibition. The ratio of pT73-Rab10 to total Rab10 is elevated in iPD patients with greater motor dysfunction, irrespective of disease duration, age, sex, or the usage of PD-related or anti-inflammatory medications. pT73-Rab10 to total Rab10 ratios are associated with neutrophil degranulation, antigenic responses, and suppressed platelet activation. 
CONCLUSIONS The extracellular serum ratio of pT73-Rab10 to total Rab10 is a novel pharmacodynamic biomarker for LRRK2-linked innate immune activation associated with disease severity in iPD. We propose that those iPD patients with higher serum pT73-Rab10 levels may benefit from LRRK2-targeting therapeutics that mitigate associated deleterious immunological responses.
Affiliation(s)
- Yuan Yuan
  - Duke Center for Neurodegeneration and Neurotherapeutics, Duke University, Durham, NC, USA
  - Department of Pharmacology and Cancer Biology, Duke University, Durham, NC, USA
- Huizhong Li
  - Duke Center for Neurodegeneration and Neurotherapeutics, Duke University, Durham, NC, USA
  - Department of Pharmacology and Cancer Biology, Duke University, Durham, NC, USA
- Kashyap Sreeram
  - Duke Center for Neurodegeneration and Neurotherapeutics, Duke University, Durham, NC, USA
  - Department of Pharmacology and Cancer Biology, Duke University, Durham, NC, USA
- Tuyana Malankhanova
  - Duke Center for Neurodegeneration and Neurotherapeutics, Duke University, Durham, NC, USA
  - Department of Pharmacology and Cancer Biology, Duke University, Durham, NC, USA
- Ravindra Boddu
  - Duke Center for Neurodegeneration and Neurotherapeutics, Duke University, Durham, NC, USA
  - Department of Pharmacology and Cancer Biology, Duke University, Durham, NC, USA
- Samuel Strader
  - Duke Center for Neurodegeneration and Neurotherapeutics, Duke University, Durham, NC, USA
  - Department of Pharmacology and Cancer Biology, Duke University, Durham, NC, USA
- Allison Chang
  - Duke Center for Neurodegeneration and Neurotherapeutics, Duke University, Durham, NC, USA
  - Department of Pharmacology and Cancer Biology, Duke University, Durham, NC, USA
- Nicole Bryant
  - Duke Center for Neurodegeneration and Neurotherapeutics, Duke University, Durham, NC, USA
  - Department of Pharmacology and Cancer Biology, Duke University, Durham, NC, USA
- Talene A Yacoubian
  - Department of Neurology, University of Alabama at Birmingham, Birmingham, AL, USA
- David G Standaert
  - Department of Neurodegenerative Science, Van Andel Institute, Grand Rapids, MI, USA
- Madalynn Erb
  - Department of Neurodegenerative Science, Van Andel Institute, Grand Rapids, MI, USA
- Darren J Moore
  - Department of Neurodegenerative Science, Van Andel Institute, Grand Rapids, MI, USA
- Laurie H Sanders
  - Duke Center for Neurodegeneration and Neurotherapeutics, Duke University, Durham, NC, USA
  - Department of Neurology, Duke University, Durham, NC, USA
  - Department of Pathology, Duke University, Durham, NC, USA
- Michael W Lutz
  - Department of Neurology, Duke University, Durham, NC, USA
  - Department of Pathology, Duke University, Durham, NC, USA
- Andrew B West
  - Duke Center for Neurodegeneration and Neurotherapeutics, Duke University, Durham, NC, USA
  - Department of Pharmacology and Cancer Biology, Duke University, Durham, NC, USA
  - Department of Neurology, University of Alabama at Birmingham, Birmingham, AL, USA
  - Department of Neurology, Duke University, Durham, NC, USA
  - Department of Neurobiology, Duke University, Durham, NC, USA
10
Wang B, Luan Y. Evaluation of normalization methods for predicting quantitative phenotypes in metagenomic data analysis. Front Genet 2024; 15:1369628. [PMID: 38903761 PMCID: PMC11188486 DOI: 10.3389/fgene.2024.1369628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Accepted: 05/13/2024] [Indexed: 06/22/2024] Open
Abstract
Genotype-to-phenotype mapping is an essential problem in the current genomic era. While qualitative case-control predictions have received significant attention, less emphasis has been placed on predicting quantitative phenotypes. This emerging field holds great promise in revealing intricate connections between microbial communities and host health. However, the presence of heterogeneity in microbiome datasets poses a substantial challenge to the accuracy of predictions and undermines the reproducibility of models. To tackle this challenge, we investigated 22 normalization methods that aimed at removing heterogeneity across multiple datasets, conducted a comprehensive review of them, and evaluated their effectiveness in predicting quantitative phenotypes in three simulation scenarios and 31 real datasets. The results indicate that none of these methods demonstrate significant superiority in predicting quantitative phenotypes or attain a noteworthy reduction in Root Mean Squared Error (RMSE) of the predictions. Given the frequent occurrence of batch effects and the satisfactory performance of batch correction methods in predicting datasets affected by these effects, we strongly recommend utilizing batch correction methods as the initial step in predicting quantitative phenotypes. In summary, the performance of normalization methods in predicting metagenomic data remains a dynamic and ongoing research area. Our study contributes to this field by undertaking a comprehensive evaluation of diverse methods and offering valuable insights into their effectiveness in predicting quantitative phenotypes.
Affiliation(s)
- Beibei Wang
  - Frontier Science Center for Nonlinear Expectations, Ministry of Education, Qingdao, China
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
  - School of Mathematics, Shandong University, Jinan, China
- Yihui Luan
  - Frontier Science Center for Nonlinear Expectations, Ministry of Education, Qingdao, China
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
  - School of Mathematics, Shandong University, Jinan, China
11
|
Chen C, Demirkhanyan L, Gondi CS. The Multifaceted Role of miR-21 in Pancreatic Cancers. Cells 2024; 13:948. [PMID: 38891080 PMCID: PMC11172074 DOI: 10.3390/cells13110948] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 05/07/2024] [Revised: 05/23/2024] [Accepted: 05/27/2024] [Indexed: 06/21/2024] Open
Abstract
With the lack of specific signs and symptoms, pancreatic ductal adenocarcinoma (PDAC) is often diagnosed at late metastatic stages, resulting in poor survival outcomes. Among various biomarkers, microRNA-21 (miR-21), a small non-coding RNA, is highly expressed in PDAC. By inhibiting regulatory proteins at the 3' untranslated regions (UTR), miR-21 plays significant roles in PDAC cell proliferation, epithelial-mesenchymal transition, angiogenesis, as well as cancer invasion, metastasis, and therapy resistance. We conducted a systematic search across major databases for articles on miR-21 and pancreatic cancer mainly published within the last decade, focusing on their diagnostic, prognostic, therapeutic, and biological roles. This rigorous approach ensured a comprehensive review of miR-21's multifaceted role in pancreatic cancers. In this review, we explore the current understanding and future directions regarding the regulation and the diagnostic, prognostic, and therapeutic potential of targeting miR-21 in PDAC. This exhaustive review discusses the involvement of miR-21 in proliferation, epithelial-mesenchymal transition (EMT), apoptosis modulation, angiogenesis, and its role in therapy resistance. Also discussed in the review is the interplay between various molecular pathways that contribute to tumor progression, with specific reference to pancreatic ductal adenocarcinoma.
Affiliation(s)
- Clare Chen
  - Department of Internal Medicine, University of Illinois College of Medicine Peoria, Peoria, IL 61605, USA
- Lusine Demirkhanyan
  - Department of Internal Medicine, University of Illinois College of Medicine Peoria, Peoria, IL 61605, USA
  - Departments of Internal Medicine and Surgery, University of Illinois College of Medicine Peoria, Peoria, IL 61605, USA
- Christopher S. Gondi
  - Department of Internal Medicine, University of Illinois College of Medicine Peoria, Peoria, IL 61605, USA
  - Departments of Internal Medicine and Surgery, University of Illinois College of Medicine Peoria, Peoria, IL 61605, USA
  - Departments of Internal Medicine, Surgery, and Health Science Education and Pathology, University of Illinois College of Medicine Peoria, Peoria, IL 61605, USA
  - Health Care Engineering Systems Center, The Grainger College of Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
12
Sommer-Trembo C, Santos ME, Clark B, Werner M, Fages A, Matschiner M, Hornung S, Ronco F, Oliver C, Garcia C, Tschopp P, Malinsky M, Salzburger W. The genetics of niche-specific behavioral tendencies in an adaptive radiation of cichlid fishes. Science 2024; 384:470-475. [PMID: 38662824 DOI: 10.1126/science.adj9228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 03/12/2024] [Indexed: 05/03/2024]
Abstract
Behavior is critical for animal survival and reproduction, and possibly for diversification and evolutionary radiation. However, the genetics behind adaptive variation in behavior are poorly understood. In this work, we examined a fundamental and widespread behavioral trait, exploratory behavior, in one of the largest adaptive radiations on Earth, the cichlid fishes of Lake Tanganyika. By integrating quantitative behavioral data from 57 cichlid species (702 wild-caught individuals) with high-resolution ecomorphological and genomic information, we show that exploratory behavior is linked to macrohabitat niche adaptations in Tanganyikan cichlids. Furthermore, we uncovered a correlation between the genotypes at a single-nucleotide polymorphism upstream of the AMPA glutamate-receptor regulatory gene cacng5b and variation in exploratory tendency. We validated this association using behavioral predictions with a neural network approach and CRISPR-Cas9 genome editing.
Affiliation(s)
- Carolin Sommer-Trembo
  - Zoological Institute, Department of Environmental Sciences, University of Basel, Basel, Switzerland
- M Emília Santos
  - Department of Zoology, University of Cambridge, Cambridge, UK
- Bethan Clark
  - Department of Zoology, University of Cambridge, Cambridge, UK
- Marco Werner
  - Leibniz-Institute for Polymer Research Dresden, Dresden, Germany
- Antoine Fages
  - Zoological Institute, Department of Environmental Sciences, University of Basel, Basel, Switzerland
- Simon Hornung
  - Zoological Institute, Department of Environmental Sciences, University of Basel, Basel, Switzerland
- Fabrizia Ronco
  - Zoological Institute, Department of Environmental Sciences, University of Basel, Basel, Switzerland
  - Natural History Museum, University of Oslo, Oslo, Norway
- Chantal Oliver
  - Zoological Institute, Department of Environmental Sciences, University of Basel, Basel, Switzerland
- Cody Garcia
  - Zoological Institute, Department of Environmental Sciences, University of Basel, Basel, Switzerland
- Patrick Tschopp
  - Zoological Institute, Department of Environmental Sciences, University of Basel, Basel, Switzerland
- Milan Malinsky
  - Department of Biology, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Walter Salzburger
  - Zoological Institute, Department of Environmental Sciences, University of Basel, Basel, Switzerland
13
Yuan Y, Li H, Sreeram K, Malankhanova T, Boddu R, Strader S, Chang A, Bryant N, Yacoubian TA, Standaert DG, Erb M, Moore DJ, Sanders LH, Lutz MW, Velmeshev D, West AB. Single molecule array measures of LRRK2 kinase activity in serum link Parkinson's disease severity to peripheral inflammation. bioRxiv 2024:2024.04.15.589570. [PMID: 38659797 PMCID: PMC11042295 DOI: 10.1101/2024.04.15.589570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
Background LRRK2-targeting therapeutics that inhibit LRRK2 kinase activity have advanced to clinical trials in idiopathic Parkinson's disease (iPD). LRRK2 phosphorylates Rab10 on endolysosomes in phagocytic cells to promote some types of immunological responses. The identification of factors that regulate LRRK2-mediated Rab10 phosphorylation in iPD, and whether phosphorylated-Rab10 levels change in different disease states, or with disease progression, may provide insights into the role of Rab10 phosphorylation in iPD and help guide therapeutic strategies targeting this pathway. Methods Capitalizing on past work demonstrating LRRK2 and phosphorylated-Rab10 interact on vesicles that can shed into biofluids, we developed and validated a high-throughput single-molecule array assay to measure extracellular pT73-Rab10. Ratios of pT73-Rab10 to total Rab10 measured in biobanked serum samples were compared between informative groups of transgenic mice, rats, and a deeply phenotyped cohort of iPD cases and controls. Multivariable and weighted correlation network analyses were used to identify genetic, transcriptomic, clinical, and demographic variables that predict the extracellular pT73-Rab10 to total Rab10 ratio. Results pT73-Rab10 is absent in serum from Lrrk2 knockout mice but elevated by LRRK2 and VPS35 mutations, as well as SNCA expression. Bone-marrow transplantation experiments in mice show that serum pT73-Rab10 levels derive primarily from circulating immune cells. The extracellular ratio of pT73-Rab10 to total Rab10 is dynamic, increasing with inflammation and rapidly decreasing with LRRK2 kinase inhibition. The ratio of pT73-Rab10 to total Rab10 is elevated in iPD patients with greater motor dysfunction, irrespective of disease duration, age, sex, or the usage of PD-related or anti-inflammatory medications. pT73-Rab10 to total Rab10 ratios are associated with neutrophil activation, antigenic responses, and the suppression of platelet activation. 
Conclusions The extracellular ratio of pT73-Rab10 to total Rab10 in serum is a novel pharmacodynamic biomarker for LRRK2-linked innate immune activation associated with disease severity in iPD. We propose that those iPD patients with higher serum pT73-Rab10 levels may benefit from LRRK2-targeting therapeutics to mitigate associated deleterious immunological responses.
14
Wang B, Sun F, Luan Y. Comparison of the effectiveness of different normalization methods for metagenomic cross-study phenotype prediction under heterogeneity. Sci Rep 2024; 14:7024. [PMID: 38528097 DOI: 10.1038/s41598-024-57670-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 03/20/2024] [Indexed: 03/27/2024] Open
Abstract
The human microbiome, comprising microorganisms residing within and on the human body, plays a crucial role in various physiological processes and has been linked to numerous diseases. To analyze microbiome data, it is essential to account for inherent heterogeneity and variability across samples. Normalization methods have been proposed to mitigate these variations and enhance comparability. However, the performance of these methods in predicting binary phenotypes remains understudied. This study systematically evaluates different normalization methods in microbiome data analysis and their impact on disease prediction. Our findings highlight the strengths and limitations of scaling, compositional data analysis, transformation, and batch correction methods. Scaling methods like TMM show consistent performance, while compositional data analysis methods exhibit mixed results. Transformation methods, such as Blom and NPN, demonstrate promise in capturing complex associations. Batch correction methods, including BMC and Limma, consistently outperform other approaches. However, the influence of normalization methods is constrained by population effects, disease effects, and batch effects. These results provide insights for selecting appropriate normalization approaches in microbiome research, improving predictive models, and advancing personalized medicine. Future research should explore larger and more diverse datasets and develop tailored normalization strategies for microbiome data analysis.
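The Blom transformation named above is a rank-based inverse normal transformation. The following is a minimal sketch, assuming the common Blom offset c = 3/8 and average ranks for tied values; the function name `blom_transform` and the toy abundances are hypothetical, and the study's own implementation may differ.

```python
from statistics import NormalDist


def blom_transform(values):
    """Map values to normal scores via ranks: z = Phi^-1((r - 3/8) / (n + 1/4))."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank for the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    return [std_normal.inv_cdf((r - 0.375) / (n + 0.25)) for r in ranks]


# Hypothetical, heavily skewed abundances for one taxon across five samples.
abundances = [3, 1500, 12, 0, 47]
print(blom_transform(abundances))
```

Because only ranks enter the formula, the output is unaffected by any monotone rescaling of the input, which is one reason such transformations are attractive for skewed count data.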
Affiliation(s)
- Beibei Wang
  - Frontier Science Center for Nonlinear Expectations, Ministry of Education, Qingdao, 266237, China
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
  - School of Mathematics, Shandong University, Jinan, 250100, China
- Fengzhu Sun
  - Quantitative and Computational Biology Department, University of Southern California, Los Angeles, 90089, USA
- Yihui Luan
  - Frontier Science Center for Nonlinear Expectations, Ministry of Education, Qingdao, 266237, China
  - Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, 266237, China
  - School of Mathematics, Shandong University, Jinan, 250100, China
15
Cao THM, Le APH, Tran TT, Huynh VK, Pham BH, Le TM, Nguyen QL, Tran TC, Tong TM, Than THN, Nguyen TTT, Ha HTT. Plasma cell-free RNA profiling of Vietnamese Alzheimer's patients reveals a linkage with chronic inflammation and apoptosis: a pilot study. Front Mol Neurosci 2023; 16:1308610. [PMID: 38178908 PMCID: PMC10764507 DOI: 10.3389/fnmol.2023.1308610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 12/04/2023] [Indexed: 01/06/2024] Open
Abstract
Introduction Circulating cell-free RNA (cfRNA) is a potential hallmark for early diagnosis of Alzheimer's Disease (AD) as it reflects gene expression levels, giving insights into the pathological progress from the outset. Profiles of cfRNA in Caucasian AD patients have been investigated thoroughly, yet there was no report exploring cfRNAs in the ASEAN groups. This study examined the gap, expecting to support the development of point-of-care AD diagnosis. Methods cfRNA profiles were characterized from 20 Vietnamese plasma samples (10 probable AD and 10 age-matched controls). RNA reads were subjected to differential expression (DE) analysis. Weighted gene correlation network analysis (WGCNA) was performed to identify gene modules that were significantly co-expressed. These modules' expression profiles were then correlated with AD status to identify relevant modules. Genes with the highest intramodular connectivity (module membership) were selected as hub genes. Transcript counts of differentially expressed genes were correlated with key AD measures (MMSE and MTA scores) to identify potential biomarkers. Results 136 genes were identified as significant AD hallmarks (p < 0.05), with 52 downregulated and 84 upregulated in the AD cohort. 45.6% of these genes are highly expressed in the hippocampus, cerebellum, and cerebral cortex. Notably, all markers related to chronic inflammation were upregulated, and there was a significant shift in all apoptotic markers. Three co-expressed modules were found to be significantly correlated with Alzheimer's status (p < 0.05; R² > 0.5). Functional enrichment analysis on these modules reveals an association with focal adhesion, nucleocytoplasmic transport, and metal ion response leading to apoptosis, suggesting the potential participation of these pathways in AD pathology. 47 significant hub genes were found to be differentially expressed genes with the highest connectivity. Six significant hub genes (CREB1, YTHDC1, IL1RL1, PHACTR2, ANKRD36B, RNF213) were found to be significantly correlated with MTA and MMSE scores. Other significant transcripts (XRN1, UBB, CHP1, THBS1, S100A9) were found to be involved in inflammation and neuronal death. Overall, we have identified candidate transcripts in plasma cf-RNA that are differentially expressed and are implicated in inflammation and apoptosis, which can jumpstart further investigations into applying cf-RNA as an AD biomarker in Vietnam and ASEAN countries.
Affiliation(s)
- Thien Hoang Minh Cao
  - School of Biomedical Engineering, International University, Ho Chi Minh City, Vietnam
  - Vietnam National University, Ho Chi Minh City, Vietnam
- Anh Phuc Hoang Le
  - School of Biomedical Engineering, International University, Ho Chi Minh City, Vietnam
  - Vietnam National University, Ho Chi Minh City, Vietnam
- Tai Tien Tran
  - Department of Physiology, Pathophysiology and Immunology, Pham Ngoc Thach University of Medicine, Ho Chi Minh City, Vietnam
- Vy Kim Huynh
  - School of Biomedical Engineering, International University, Ho Chi Minh City, Vietnam
  - Vietnam National University, Ho Chi Minh City, Vietnam
- Bao Hoai Pham
  - School of Biomedical Engineering, International University, Ho Chi Minh City, Vietnam
  - Vietnam National University, Ho Chi Minh City, Vietnam
- Thao Mai Le
  - School of Biomedical Engineering, International University, Ho Chi Minh City, Vietnam
  - Vietnam National University, Ho Chi Minh City, Vietnam
- Quang Lam Nguyen
  - School of Biomedical Engineering, International University, Ho Chi Minh City, Vietnam
  - Vietnam National University, Ho Chi Minh City, Vietnam
- Thang Cong Tran
  - Department of Neurology, Faculty of Medicine, University of Medicine and Pharmacy at Ho Chi Minh City, Ho Chi Minh City, Vietnam
- Trang Mai Tong
  - Department of Neurology, University Medical Center, Ho Chi Minh City, Vietnam
- The Ha Ngoc Than
  - Department of Geriatrics, Faculty of Medicine, University of Medicine and Pharmacy at Ho Chi Minh City, Ho Chi Minh City, Vietnam
  - Department of Geriatrics and Palliative Care, University Medical Center, Ho Chi Minh City, Vietnam
- Tran Tran To Nguyen
  - Department of Geriatrics, Faculty of Medicine, University of Medicine and Pharmacy at Ho Chi Minh City, Ho Chi Minh City, Vietnam
- Huong Thi Thanh Ha
  - School of Biomedical Engineering, International University, Ho Chi Minh City, Vietnam
  - Vietnam National University, Ho Chi Minh City, Vietnam
16
Margoliash J, Fuchs S, Li Y, Zhang X, Massarat A, Goren A, Gymrek M. Polymorphic short tandem repeats make widespread contributions to blood and serum traits. Cell Genomics 2023; 3:100458. [PMID: 38116119 PMCID: PMC10726533 DOI: 10.1016/j.xgen.2023.100458] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 08/21/2022] [Revised: 09/09/2023] [Accepted: 11/07/2023] [Indexed: 12/21/2023]
Abstract
Short tandem repeats (STRs) are genomic regions consisting of repeated sequences of 1-6 bp in succession. Single-nucleotide polymorphism (SNP)-based genome-wide association studies (GWASs) do not fully capture STR effects. To study these effects, we imputed 445,720 STRs into genotype arrays from 408,153 White British UK Biobank participants and tested for association with 44 blood phenotypes. Using two fine-mapping methods, we identify 119 candidate causal STR-trait associations and estimate that STRs account for 5.2%-7.6% of causal variants identifiable from GWASs for these traits. These are among the strongest associations for multiple phenotypes, including a coding CTG repeat associated with apolipoprotein B levels, a promoter CGG repeat with platelet traits, and an intronic poly(A) repeat with mean platelet volume. Our study suggests that STRs make widespread contributions to complex traits, provides stringently selected candidate causal STRs, and demonstrates the need to consider a more complete view of genetic variation in GWASs.
Affiliation(s)
- Jonathan Margoliash
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA
- Shai Fuchs
  - Pediatric Endocrine and Diabetes Unit, Edmond and Lily Safra Children's Hospital, Sheba Medical Center, Ramat Gan, Israel
- Yang Li
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Xuan Zhang
  - Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Arya Massarat
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA 92093, USA
- Alon Goren
  - Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Melissa Gymrek
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA
  - Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
17
Edrisi M, Huang X, Ogilvie HA, Nakhleh L. Accurate integration of single-cell DNA and RNA for analyzing intratumor heterogeneity using MaCroDNA. Nat Commun 2023; 14:8262. [PMID: 38092737 PMCID: PMC10719311 DOI: 10.1038/s41467-023-44014-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/28/2022] [Accepted: 11/27/2023] [Indexed: 12/17/2023] Open
Abstract
Cancers develop and progress as mutations accumulate, and with the advent of single-cell DNA and RNA sequencing, researchers can observe these mutations and their transcriptomic effects and predict proteomic changes with remarkable temporal and spatial precision. However, to connect genomic mutations with their transcriptomic and proteomic consequences, cells with either only DNA data or only RNA data must be mapped to a common domain. For this purpose, we present MaCroDNA, a method that uses maximum weighted bipartite matching of per-gene read counts from single-cell DNA and RNA-seq data. Using ground truth information from colorectal cancer data, we demonstrate the advantage of MaCroDNA over existing methods in accuracy and speed. Exemplifying the utility of single-cell data integration in cancer research, we suggest, based on results derived using MaCroDNA, that genomic mutations of large effect size increasingly contribute to differential expression between cells as Barrett's esophagus progresses to esophageal cancer, reaffirming the findings of the previous studies.
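The idea of maximum weighted bipartite matching between RNA cells and DNA cells can be sketched as follows. This is an illustration under stated assumptions, not the MaCroDNA implementation: the edge weight is taken to be the Pearson correlation of per-gene read counts, the search is an exhaustive enumeration that is only practical for a handful of cells (polynomial-time solvers such as the Hungarian algorithm are the usual choice at scale), and the names `pearson` and `match_cells` and all counts are hypothetical.

```python
import math
from itertools import permutations


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def match_cells(rna_cells, dna_cells):
    """Assign each RNA cell to a distinct DNA cell, maximizing total correlation."""
    if len(rna_cells) > len(dna_cells):
        raise ValueError("need at least as many DNA cells as RNA cells")
    weights = [[pearson(r, d) for d in dna_cells] for r in rna_cells]
    best_total, best_assignment = -math.inf, ()
    # Exhaustive search over one-to-one assignments (tiny inputs only).
    for assignment in permutations(range(len(dna_cells)), len(rna_cells)):
        total = sum(weights[i][j] for i, j in enumerate(assignment))
        if total > best_total:
            best_total, best_assignment = total, assignment
    # Pairs are (rna_index, dna_index).
    return list(enumerate(best_assignment)), best_total


# Hypothetical per-gene read counts (rows are cells, columns are genes).
dna = [[2, 2, 4, 1], [1, 3, 1, 5], [4, 1, 2, 2]]
rna = [[10, 30, 12, 48], [41, 9, 22, 18], [19, 21, 40, 11]]
pairs, total = match_cells(rna, dna)
print(pairs, round(total, 3))
```

Solving for the assignment jointly, rather than giving each RNA cell its individually best DNA cell, prevents several RNA cells from being mapped onto the same DNA cell.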
Affiliation(s)
- Xiru Huang
  - Department of Computer Science, Rice University, Houston, Texas, USA
- Huw A Ogilvie
  - Department of Computer Science, Rice University, Houston, Texas, USA
- Luay Nakhleh
  - Department of Computer Science, Rice University, Houston, Texas, USA
18
Dimayacyac JR, Wu S, Jiang D, Pennell M. Evaluating the Performance of Widely Used Phylogenetic Models for Gene Expression Evolution. Genome Biol Evol 2023; 15:evad211. [PMID: 38000902 PMCID: PMC10709115 DOI: 10.1093/gbe/evad211] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 01/08/2023] [Revised: 11/09/2023] [Accepted: 11/17/2023] [Indexed: 11/26/2023] Open
Abstract
Phylogenetic comparative methods are increasingly used to test hypotheses about the evolutionary processes that drive divergence in gene expression among species. However, it is unknown whether the distributional assumptions of phylogenetic models designed for quantitative phenotypic traits are realistic for expression data and importantly, the reliability of conclusions of phylogenetic comparative studies of gene expression may depend on whether the data is well described by the chosen model. To evaluate this, we first fit several phylogenetic models of trait evolution to 8 previously published comparative expression datasets, comprising a total of 54,774 genes with 145,927 unique gene-tissue combinations. Using a previously developed approach, we then assessed how well the best model of the set described the data in an absolute (not just relative) sense. First, we find that Ornstein-Uhlenbeck models, in which expression values are constrained around an optimum, were the preferred models for 66% of gene-tissue combinations. Second, we find that for 61% of gene-tissue combinations, the best-fit model of the set was found to perform well; the rest were found to be performing poorly by at least one of the test statistics we examined. Third, we find that when simple models do not perform well, this appears to be typically a consequence of failing to fully account for heterogeneity in the rate of the evolution. We advocate that assessment of model performance should become a routine component of phylogenetic comparative expression studies; doing so can improve the reliability of inferences and inspire the development of novel models.
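For readers unfamiliar with the Ornstein-Uhlenbeck (OU) model mentioned above, its standard form is sketched below. The notation is the conventional one and is an assumption here rather than the paper's own: X_t is the expression level, θ the optimum, α > 0 the strength of the pull toward the optimum, σ the diffusion rate, and W_t a standard Brownian motion.

```latex
\begin{align}
  dX_t &= \alpha\,(\theta - X_t)\,dt + \sigma\,dW_t \\
  \mathbb{E}[X_t \mid X_0] &= \theta + (X_0 - \theta)\,e^{-\alpha t} \\
  \operatorname{Var}[X_t \mid X_0] &= \frac{\sigma^2}{2\alpha}\left(1 - e^{-2\alpha t}\right)
\end{align}
```

As α approaches 0 the variance tends to σ²t, the Brownian motion case, whereas for α > 0 it levels off at σ²/(2α). That bounded variance is what "constrained around an optimum" means in practice.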
Affiliation(s)
- Jose Rafael Dimayacyac
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Shanyun Wu
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Department of Developmental Biology, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Daohan Jiang
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Matt Pennell
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
  - Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
19
Padwal MK, Basu S, Basu B. Application of Machine Learning in Predicting Hepatic Metastasis or Primary Site in Gastroenteropancreatic Neuroendocrine Tumors. Curr Oncol 2023; 30:9244-9261. [PMID: 37887568 PMCID: PMC10605255 DOI: 10.3390/curroncol30100668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 10/16/2023] [Accepted: 10/16/2023] [Indexed: 10/28/2023] Open
Abstract
Gastroenteropancreatic neuroendocrine tumors (GEP-NETs) account for 80% of gastroenteropancreatic neuroendocrine neoplasms (GEP-NENs). GEP-NETs are well-differentiated tumors, highly heterogeneous in biology and origin, and are often diagnosed at the metastatic stage. Diagnosis is commonly through clinical symptoms, histopathology, and PET-CT imaging, while molecular markers for metastasis and the primary site are unknown. Here, we report the identification of multi-gene signatures for hepatic metastasis and primary sites through analyses on RNA-SEQ datasets of pancreatic and small intestinal NETs tissue samples. Relevant gene features, identified from the normalized RNA-SEQ data using the mRMRe algorithm, were used to develop seven Machine Learning models (LDA, RF, CART, k-NN, SVM, XGBOOST, GBM). Two multi-gene random forest (RF) models classified primary and metastatic samples with 100% accuracy in training and test cohorts and >90% accuracy in an independent validation cohort. Similarly, three multi-gene RF models identified the pancreas or small intestine as the primary site with 100% accuracy in training and test cohorts, and >95% accuracy in an independent cohort. Multi-label models for concurrent prediction of hepatic metastasis and primary site returned >98.42% and >87.42% accuracies on training and test cohorts, respectively. A robust molecular signature to predict liver metastasis or the primary site for GEP-NETs is reported for the first time and could complement the clinical management of GEP-NETs.
Affiliation(s)
- Mahesh Kumar Padwal
  - Molecular Biology Division, Bhabha Atomic Research Centre, Mumbai 400085, India
  - Homi Bhabha National Institute, Mumbai 400094, India
- Sandip Basu
  - Homi Bhabha National Institute, Mumbai 400094, India
  - Radiation Medicine Centre, Bhabha Atomic Research Centre, Tata Memorial Hospital Annexe, Mumbai 400012, India
- Bhakti Basu
  - Molecular Biology Division, Bhabha Atomic Research Centre, Mumbai 400085, India
  - Homi Bhabha National Institute, Mumbai 400094, India
20
Wu HW, Wu JD, Yeh YP, Wu TH, Chao CH, Wang W, Chen TW. DoSurvive: A webtool for investigating the prognostic power of a single or combined cancer biomarker. iScience 2023; 26:107269. [PMID: 37609633 PMCID: PMC10440714 DOI: 10.1016/j.isci.2023.107269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Revised: 05/26/2023] [Accepted: 06/28/2023] [Indexed: 08/24/2023] Open
Abstract
We present DoSurvive, a user-friendly survival analysis web tool and a cancer prognostic biomarker-centered database. DoSurvive is the first database that allows users to perform multivariate survival analysis for cancers with customized gene/patient lists. DoSurvive offers three survival analysis methods, the log-rank test, Cox regression, and the accelerated failure time (AFT) model, for users to analyze five types of quantitative features (mRNA, miRNA, lncRNA, protein, and methylation of CpG islands) with four survival types, i.e., overall survival, disease-specific survival, disease-free interval, and progression-free interval, in 33 cancer types. Notably, the implemented AFT model provides an alternative method for genes/features that fail the proportional hazards assumption in Cox regression. With the unprecedented number of survival models implemented and high flexibility in analysis, DoSurvive is a unique platform for the identification of clinically relevant targets for cancer researchers and practitioners. DoSurvive is freely available at http://dosurvive.lab.nycu.edu.tw/.
Affiliation(s)
- Hao-Wei Wu
  - Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 30068, Taiwan
- Jian-De Wu
  - Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 30068, Taiwan
- Yen-Ping Yeh
  - Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 30068, Taiwan
- Timothy H. Wu
  - Institute of Ecology and Evolutionary Biology, National Taiwan University, Taipei 10617, Taiwan
- Chi-Hong Chao
  - Institute of Molecular Medicine and Bioengineering, National Yang Ming Chiao Tung University, Hsinchu 30068, Taiwan
  - Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu 30068, Taiwan
  - Center For Intelligent Drug Systems and Smart Bio-devices (IDSB), National Yang Ming Chiao Tung University, Hsinchu 30068, Taiwan
- Weijing Wang
  - Institute of Statistics, National Yang Ming Chiao Tung University, Hsinchu 30068, Taiwan
- Ting-Wen Chen
  - Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 30068, Taiwan
  - Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu 30068, Taiwan
  - Center For Intelligent Drug Systems and Smart Bio-devices (IDSB), National Yang Ming Chiao Tung University, Hsinchu 30068, Taiwan
21
Dimayacyac JR, Wu S, Jiang D, Pennell M. Evaluating the Performance of Widely Used Phylogenetic Models for Gene Expression Evolution. bioRxiv 2023:2023.02.09.527893. [PMID: 37645857 PMCID: PMC10461906 DOI: 10.1101/2023.02.09.527893] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 08/31/2023]
Abstract
Phylogenetic comparative methods are increasingly used to test hypotheses about the evolutionary processes that drive divergence in gene expression among species. However, it is unknown whether the distributional assumptions of phylogenetic models designed for quantitative phenotypic traits are realistic for expression data and importantly, the reliability of conclusions of phylogenetic comparative studies of gene expression may depend on whether the data is well-described by the chosen model. To evaluate this, we first fit several phylogenetic models of trait evolution to 8 previously published comparative expression datasets, comprising a total of 54,774 genes with 145,927 unique gene-tissue combinations. Using a previously developed approach, we then assessed how well the best model of the set described the data in an absolute (not just relative) sense. First, we find that Ornstein-Uhlenbeck models, in which expression values are constrained around an optimum, were the preferred model for 66% of gene-tissue combinations. Second, we find that for 61% of gene-tissue combinations, the best fit model of the set was found to perform well; the rest were found to be performing poorly by at least one of the test statistics we examined. Third, we find that when simple models do not perform well, this appears to be typically a consequence of failing to fully account for heterogeneity in the rate of the evolution. We advocate that assessment of model performance should become a routine component of phylogenetic comparative expression studies; doing so can improve the reliability of inferences and inspire the development of novel models.
Affiliation(s)
- Jose Rafael Dimayacyac
  - Department of Zoology, University of British Columbia, Canada
  - Michael Smith Laboratories, University of British Columbia, Canada
- Shanyun Wu
  - Department of Zoology, University of British Columbia, Canada
  - Department of Genetics, Washington University School of Medicine, USA
- Daohan Jiang
  - Department of Quantitative and Computational Biology, University of Southern California, USA
- Matt Pennell
  - Department of Zoology, University of British Columbia, Canada
  - Department of Quantitative and Computational Biology, University of Southern California, USA
  - Department of Biological Sciences, University of Southern California, USA
|
22
|
Marrella MA, Biase FH. Robust identification of regulatory variants (eQTLs) using a differential expression framework developed for RNA-sequencing. J Anim Sci Biotechnol 2023; 14:62. [PMID: 37143150 PMCID: PMC10161580 DOI: 10.1186/s40104-023-00861-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Accepted: 03/05/2023] [Indexed: 05/06/2023] Open
Abstract
BACKGROUND A gap currently exists between genetic variants and the underlying cell and tissue biology of a trait, and expression quantitative trait loci (eQTL) studies provide important information to help close that gap. However, two concerns that arise with eQTL analyses using RNA-sequencing data are the normalization of data across samples and the fact that the data do not follow a normal distribution. Multiple pipelines have been suggested to address this. For instance, the most recent analysis of the human and farm Genotype-Tissue Expression (GTEx) project proposes using trimmed means of M-values (TMM) to normalize the data, followed by an inverse normal transformation. RESULTS In this study, we reasoned that eQTL analysis could be carried out using the same framework used for differential gene expression (DGE), which uses a negative binomial model, a statistical model suited to count data. Using the GTEx framework, we identified 35 significant eQTLs (P < 5 × 10⁻⁸) following the ANOVA model and 39 significant eQTLs (P < 5 × 10⁻⁸) following the additive model. Using a differential gene expression framework, we identified 930 and 6 significant eQTLs (P < 5 × 10⁻⁸) following an analytical framework equivalent to the ANOVA and additive model, respectively. When we compared the two approaches, there was no overlap of significant eQTLs between the two frameworks. Because we defined specific contrasts, we identified trans eQTLs that more closely resembled what we expect from genetic variants showing complete dominance between alleles. Yet, these were not identified by the GTEx framework. CONCLUSIONS Our results show that transforming RNA-sequencing data to fit a normal distribution prior to eQTL analysis is not required when the DGE framework is employed. Our proposed approach detected biologically relevant variants that otherwise would not have been identified due to data transformation to fit a normal distribution.
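The inverse normal transformation mentioned in this abstract can be illustrated with a minimal sketch. It is not the GTEx pipeline or the authors' code: the function name `inverse_normal_transform`, the rank offset of 0.5, the average-rank handling of ties, and the toy expression values are all assumptions made for illustration.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.5):
    """Replace each value by the standard-normal quantile of its rank.

    Ties receive the average of the ranks they span. The offset of 0.5 is
    one common convention and is an assumption here, not a value taken
    from the abstract.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the group
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - offset) / n) for r in ranks]


# Hypothetical normalized expression values for one gene across five samples.
expr = [12.0, 250.0, 3.0, 40.0, 40.0]
transformed = inverse_normal_transform(expr)
```

The transform keeps only the ordering of samples, so the heavy right tail typical of count data disappears. That loss of magnitude information is the step the abstract argues can be avoided by modelling counts directly.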
Affiliation(s)
- Mackenzie A Marrella
  - School of Animal Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
- Fernando H Biase
  - School of Animal Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
|
23
|
Mondol RK, Millar EKA, Graham PH, Browne L, Sowmya A, Meijering E. hist2RNA: An Efficient Deep Learning Architecture to Predict Gene Expression from Breast Cancer Histopathology Images. Cancers (Basel) 2023; 15:2569. [PMID: 37174035 PMCID: PMC10177559 DOI: 10.3390/cancers15092569] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 04/10/2023] [Revised: 04/23/2023] [Accepted: 04/28/2023] [Indexed: 05/15/2023] Open
Abstract
Gene expression can be used to subtype breast cancer with improved prediction of risk of recurrence and treatment responsiveness over that obtained using routine immunohistochemistry (IHC). However, in the clinic, molecular profiling is primarily used for ER+ breast cancer and is costly, tissue-destructive, requires specialised platforms, and takes several weeks to return a result. Deep learning algorithms can effectively extract morphological patterns in digital histopathology images to predict molecular phenotypes quickly and cost-effectively. We propose a new, computationally efficient approach called hist2RNA, inspired by bulk RNA sequencing techniques, to predict the expression of 138 genes (incorporated from 6 commercially available molecular profiling tests), including luminal PAM50 subtype, from hematoxylin and eosin (H&E)-stained whole slide images (WSIs). The training phase involves the aggregation of extracted features for each patient from a pretrained model to predict gene expression at the patient level using annotated H&E images from The Cancer Genome Atlas (TCGA, n = 335). We demonstrate successful gene prediction on a held-out test set (n = 160, corr = 0.82 across patients, corr = 0.29 across genes) and perform exploratory analysis on an external tissue microarray (TMA) dataset (n = 498) with known IHC and survival information. Our model is able to predict gene expression and luminal PAM50 subtype (Luminal A versus Luminal B) on the TMA dataset with prognostic significance for overall survival in univariate analysis (c-index = 0.56, hazard ratio = 2.16 (95% CI 1.12-3.06), p < 5 × 10⁻³), and independent significance in multivariate analysis incorporating standard clinicopathological variables (c-index = 0.65, hazard ratio = 1.87 (95% CI 1.30-2.68), p < 5 × 10⁻³). The proposed strategy achieves superior performance while requiring less training time, resulting in less energy consumption and computational cost compared to patch-based models. Additionally, hist2RNA predicts gene expression that has the potential to determine luminal molecular subtypes, which correlate with overall survival, without the need for expensive molecular testing.
Affiliation(s)
- Raktim Kumar Mondol
  - School of Computer Science and Engineering, UNSW Sydney, Kensington, NSW 2052, Australia
- Ewan K. A. Millar
  - Department of Anatomical Pathology, NSW Health Pathology, St. George Hospital, Kogarah, NSW 2217, Australia
  - St. George and Sutherland Clinical School, UNSW Sydney, Kensington, NSW 2052, Australia
  - Faculty of Medicine and Health Sciences, Western Sydney University, Campbelltown, NSW 2560, Australia
  - University of Technology Sydney, Ultimo, NSW 2007, Australia
- Peter H. Graham
  - St. George and Sutherland Clinical School, UNSW Sydney, Kensington, NSW 2052, Australia
  - Cancer Care Centre, St George Hospital, Sydney, NSW 2217, Australia
- Lois Browne
  - Cancer Care Centre, St George Hospital, Sydney, NSW 2217, Australia
- Arcot Sowmya
  - School of Computer Science and Engineering, UNSW Sydney, Kensington, NSW 2052, Australia
- Erik Meijering
  - School of Computer Science and Engineering, UNSW Sydney, Kensington, NSW 2052, Australia
|
24
|
Oh S, Liu X, Tomei S, Luo M, Skinner JP, Berzins SP, Naik SH, Gray DHD, Chong MMW. Distinct subpopulations of DN1 thymocytes exhibit preferential γδ T lineage potential. Front Immunol 2023; 14:1106652. [PMID: 37077921 PMCID: PMC10106834 DOI: 10.3389/fimmu.2023.1106652] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/24/2022] [Accepted: 03/21/2023] [Indexed: 04/05/2023] Open
Abstract
The αβ and γδ T cell lineages both differentiate in the thymus from common uncommitted progenitors. The earliest stage of T cell development is known as CD4⁻CD8⁻ double negative 1 (DN1), which has previously been shown to be a heterogeneous mixture of cells. Of these, only the CD117⁺ fraction has been proposed to be true T cell progenitors that progress to the DN2 and DN3 thymocyte stages, at which point the development of the αβ and γδ T cell lineages diverges. However, it has recently been shown that at least some γδ T cells may be derived from a subset of CD117⁻ DN thymocytes. Along with other ambiguities, this suggests that T cell development may not be as straightforward as previously thought. To better understand early T cell development, particularly the heterogeneity of DN1 thymocytes, we performed single-cell RNA sequencing (scRNA-seq) of mouse DN and γδ thymocytes and show that the various DN stages indeed comprise transcriptionally diverse subpopulations of cells. We also show that multiple subpopulations of DN1 thymocytes exhibit preferential development towards the γδ lineage. Furthermore, specific γδ-primed DN1 subpopulations preferentially develop into IL-17- or IFNγ-producing γδ T cells. We show that DN1 subpopulations that only give rise to IL-17-producing γδ T cells already express many of the transcription factors associated with type 17 immune cell responses, while the DN1 subpopulations that can give rise to IFNγ-producing γδ T cells already express transcription factors associated with type 1 immune cell responses.
Affiliation(s)
- Seungyoul Oh
  - St Vincent’s Institute of Medical Research, Fitzroy, VIC, Australia
  - Department of Medicine (St Vincent’s), University of Melbourne, Fitzroy, VIC, Australia
- Xin Liu
  - St Vincent’s Institute of Medical Research, Fitzroy, VIC, Australia
- Sara Tomei
  - The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Mengxiao Luo
  - The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Stuart P. Berzins
  - Department of Microbiology and Immunology, University of Melbourne, Melbourne, VIC, Australia
  - Institute of Innovation, Science and Sustainability, Federation University Australia, Ballarat, VIC, Australia
- Shalin H. Naik
  - The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Daniel H. D. Gray
  - The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Mark M. W. Chong
  - St Vincent’s Institute of Medical Research, Fitzroy, VIC, Australia
  - Department of Medicine (St Vincent’s), University of Melbourne, Fitzroy, VIC, Australia
  - Correspondence: Mark M. W. Chong
|
25
|
Le Priol C, Azencott CA, Gidrol X. Detection of genes with differential expression dispersion unravels the role of autophagy in cancer progression. PLoS Comput Biol 2023; 19:e1010342. [PMID: 36893104 PMCID: PMC9997931 DOI: 10.1371/journal.pcbi.1010342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Accepted: 02/09/2023] [Indexed: 03/10/2023] Open
Abstract
The majority of gene expression studies focus on the search for genes whose mean expression differs between two or more populations of samples, the so-called "differential expression analysis" approach. However, a difference in the variance of gene expression may also be biologically and physiologically relevant. In the classical statistical model used to analyze RNA-sequencing (RNA-seq) data, the dispersion, which defines the variance, is only considered as a parameter to be estimated prior to identifying a difference in mean expression between conditions of interest. Here, we evaluate four recently published methods that detect differences in both the mean and the dispersion in RNA-seq data. We thoroughly investigated the performance of these methods on simulated datasets and characterized parameter settings to reliably detect genes with a differential expression dispersion. We applied these methods to The Cancer Genome Atlas datasets. Interestingly, among the genes with an increased expression dispersion in tumors and without a change in mean expression, we identified some key cellular functions, most of which were related to catabolism and were overrepresented in most of the analyzed cancers. In particular, our results highlight autophagy, whose role in carcinogenesis is context-dependent, illustrating the potential of the differential dispersion approach to gain new insights into biological processes and to discover new biomarkers.
Affiliation(s)
- Christophe Le Priol
  - Univ. Grenoble Alpes, INSERM, CEA-IRIG, Biomics, Grenoble, France
  - Corresponding author
- Chloé-Agathe Azencott
  - Center for Computational Biology, Mines ParisTech, PSL Research University, Paris, France
  - Institut Curie, Paris, France
  - INSERM U900, Paris, France
- Xavier Gidrol
  - Univ. Grenoble Alpes, INSERM, CEA-IRIG, Biomics, Grenoble, France
  - Corresponding author
|
26
|
Lipponen A, Kajevu N, Natunen T, Ciszek R, Puhakka N, Hiltunen M, Pitkänen A. Gene Expression Profile as a Predictor of Seizure Liability. Int J Mol Sci 2023; 24:4116. [PMID: 36835526 PMCID: PMC9963992 DOI: 10.3390/ijms24044116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2023] [Revised: 02/14/2023] [Accepted: 02/16/2023] [Indexed: 02/22/2023] Open
Abstract
Analysis platforms to predict drug-induced seizure liability at an early phase of drug development would improve safety and reduce attrition and the high cost of drug development. We hypothesized that a drug-induced in vitro transcriptomics signature predicts its ictogenicity. We exposed rat cortical neuronal cultures to non-toxic concentrations of 34 compounds for 24 h; 11 were known to be ictogenic (tool compounds), 13 were associated with a high number of seizure-related adverse event reports in the clinical FDA Adverse Event Reporting System (FAERS) database and systematic literature search (FAERS-positive compounds), and 10 were known to be non-ictogenic (FAERS-negative compounds). The drug-induced gene expression profile was assessed from RNA-sequencing data. Transcriptomics profiles induced by the tool, FAERS-positive and FAERS-negative compounds, were compared using bioinformatics and machine learning. Of the 13 FAERS-positive compounds, 11 induced significant differential gene expression; 10 of the 11 showed an overall high similarity to the profile of at least one tool compound, correctly predicting the ictogenicity. Alikeness-% based on the number of the same differentially expressed genes correctly categorized 85%, the Gene Set Enrichment Analysis score correctly categorized 73%, and the machine-learning approach correctly categorized 91% of the FAERS-positive compounds with reported seizure liability currently in clinical use. Our data suggest that the drug-induced gene expression profile could be used as a predictive biomarker for seizure liability.
Affiliation(s)
- Anssi Lipponen
  - A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, P.O. Box 1627, FIN-70211 Kuopio, Finland
  - Expert Microbiology Unit, Finnish Institute for Health and Welfare, P.O. Box 95, FIN-70701 Kuopio, Finland
- Natallie Kajevu
  - A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, P.O. Box 1627, FIN-70211 Kuopio, Finland
- Teemu Natunen
  - Institute of Biomedicine, University of Eastern Finland, P.O. Box 1627, FIN-70211 Kuopio, Finland
- Robert Ciszek
  - A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, P.O. Box 1627, FIN-70211 Kuopio, Finland
- Noora Puhakka
  - A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, P.O. Box 1627, FIN-70211 Kuopio, Finland
- Mikko Hiltunen
  - Institute of Biomedicine, University of Eastern Finland, P.O. Box 1627, FIN-70211 Kuopio, Finland
- Asla Pitkänen
  - A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, P.O. Box 1627, FIN-70211 Kuopio, Finland
  - Correspondence: Tel.: +358-50-517-2091; Fax: +358-17-16-3030
|
27
|
Altay G, Zapardiel-Gonzalo J, Peters B. RNA-seq preprocessing and sample size considerations for gene network inference. bioRxiv 2023:2023.01.02.522518. [PMID: 36711979 PMCID: PMC9881880 DOI: 10.1101/2023.01.02.522518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Background Gene network inference (GNI) methods have the potential to reveal functional relationships between different genes and their products. Most GNI algorithms have been developed for microarray gene expression datasets, and their application to RNA-seq data is relatively recent. As the characteristics of RNA-seq data differ from those of microarray data, it remains an open question which preprocessing methods should be applied to RNA-seq data prior to GNI to attain optimal performance, and what sample size is required to obtain reliable GNI estimates. Results We ran 9144 analyses of 7 different RNA-seq datasets to evaluate 300 different preprocessing combinations that include data transformations, normalizations and association estimators. We found that there was no single best-performing preprocessing combination but that there were several good ones. The performance varied widely across datasets, which emphasizes the importance of choosing an appropriate preprocessing configuration before GNI. Two preprocessing combinations appeared promising in general: first, log2 TPM (transcripts per million) with variance-stabilizing transformation (VST) and the Pearson correlation coefficient (PCC) as association estimator; second, raw RNA-seq count data with PCC. Along with these two, we also identified 18 other good preprocessing combinations. Any of these combinations might perform best on a given dataset. Therefore, the GNI performance of these approaches should be measured on any new dataset to select the best-performing one for it. In terms of the required biological sample size of RNA-seq data, we found that between 30 and 85 samples were required to generate reliable GNI estimates. Conclusions This study provides practical recommendations on default choices for data preprocessing prior to GNI analysis of RNA-seq data to obtain optimal performance results.
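To make the notion of a "preprocessing combination" concrete, the sketch below pairs a log2 TPM transformation with a Pearson correlation association estimator on toy data. It is an illustrative assumption, not the authors' pipeline: the VST step named in the abstract is omitted, and the function names, the pseudocount of 1, the gene lengths, and the counts are hypothetical.

```python
import math


def counts_to_log2_tpm(counts, lengths_kb, pseudocount=1.0):
    """Convert one sample's read counts to log2(TPM + pseudocount).

    counts: read counts per gene; lengths_kb: gene lengths in kilobases.
    """
    rates = [c / length for c, length in zip(counts, lengths_kb)]
    per_million = sum(rates) / 1e6  # scales the rates so TPM sums to 1e6
    return [math.log2(rate / per_million + pseudocount) for rate in rates]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy data: four samples (rows) by three genes A, B, C (columns).
counts_by_sample = [
    [100, 200, 50],
    [200, 400, 30],
    [300, 600, 20],
    [400, 800, 10],
]
lengths_kb = [1.0, 2.0, 0.5]

log_tpm = [counts_to_log2_tpm(c, lengths_kb) for c in counts_by_sample]
gene_a, gene_b, gene_c = (list(col) for col in zip(*log_tpm))

# Association scores that a network inference method would then threshold.
r_ab = pearson(gene_a, gene_b)
r_ac = pearson(gene_a, gene_c)
```

In this toy example genes A and B have identical length-normalized abundance in every sample, so their correlation is essentially 1, while gene C falls as A rises and is negatively correlated with it. Swapping the transformation or the estimator changes these scores, which is the kind of sensitivity the study measures.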
Affiliation(s)
- Gökmen Altay
  - La Jolla Institute for Immunology, 9420 Athena Circle, La Jolla, CA 92037, USA
- Bjoern Peters
  - La Jolla Institute for Immunology, 9420 Athena Circle, La Jolla, CA 92037, USA
|
28
|
Wang YW, Lin WY, Wu FJ, Luo CW. Unveiling the transcriptomic landscape and the potential antagonist feedback mechanisms of TGF-β superfamily signaling module in bone and osteoporosis. Cell Commun Signal 2022; 20:190. [PMID: 36443839 PMCID: PMC9703672 DOI: 10.1186/s12964-022-01002-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/01/2022] [Accepted: 10/22/2022] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND TGF-β superfamily signaling is indispensable for bone homeostasis. However, the global expression profiles of all the genes that make up this signaling module in bone and bone-related diseases have not yet been well characterized. METHODS Transcriptomic datasets from human bone marrow, bone marrow-derived mesenchymal stem cells (MSCs) and MSCs of primary osteoporotic patients were used for expression profile analyses. Protein treatments, gene quantification, reporter assays and signaling dissection in MSC lines were used to clarify the interactive regulation and feedback mechanisms between TGF-β superfamily ligands and antagonists. Ingenuity Pathway Analysis was used for network construction. RESULTS We found that TGFB1, in the ligand group that carries out SMAD2/3 signaling, and BMP8A, BMP8B and BMP2, in the ligand group that conducts SMAD1/5/8 signaling, have relatively high expression levels in normal bone marrow and MSCs. Among 16 antagonist genes, the dominantly expressed TGF-β superfamily ligands induced only NOG, GREM1 and GREM2 via different SMAD pathways in MSCs. These induced antagonist proteins further showed distinct antagonism toward the treated ligands and would thus make up complicated negative feedback networks in bone. We further found that TGF-β superfamily signaling is enriched in MSCs of primary osteoporosis. Enhanced expression of the genes mediating TGF-β-mediated SMAD3 signaling and of the genes encoding TGF-β superfamily antagonists served as significant features of osteoporosis. CONCLUSION Our data unveil, for the first time, the transcriptional landscape of all the genes that make up the TGF-β superfamily signaling module in bone. The feedback mechanisms and regulatory network prediction of antagonists provide novel hints for treating osteoporosis.
Affiliation(s)
- Ying-Wen Wang
  - Department of Life Sciences and Institute of Genome Sciences, National Yang Ming Chiao Tung University, 155 Li-Nong Street, Section 2, Beitou, Taipei, 112 Taiwan
- Wen-Yu Lin
  - Department of Life Sciences and Institute of Genome Sciences, National Yang Ming Chiao Tung University, 155 Li-Nong Street, Section 2, Beitou, Taipei, 112 Taiwan
- Fang-Ju Wu
  - Department of Life Sciences and Institute of Genome Sciences, National Yang Ming Chiao Tung University, 155 Li-Nong Street, Section 2, Beitou, Taipei, 112 Taiwan
- Ching-Wei Luo
  - Department of Life Sciences and Institute of Genome Sciences, National Yang Ming Chiao Tung University, 155 Li-Nong Street, Section 2, Beitou, Taipei, 112 Taiwan
|
29
|
Jardillier R, Koca D, Chatelain F, Guyon L. Prognosis of lasso-like penalized Cox models with tumor profiling improves prediction over clinical data alone and benefits from bi-dimensional pre-screening. BMC Cancer 2022; 22:1045. [PMID: 36199072 PMCID: PMC9533541 DOI: 10.1186/s12885-022-10117-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Accepted: 09/14/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Prediction of patient survival from tumor molecular '-omics' data is a key step toward personalized medicine. Cox models fitted to RNA profiling datasets are popular for clinical outcome prediction. However, these models are applied in a "high-dimension" setting, as the number p of covariates (gene expressions) greatly exceeds the number n of patients and e of events. Thus, pre-screening together with penalization methods is widely used for dimension reduction. METHODS In the present paper, (i) we benchmark the performance of the lasso penalization and three variants (i.e., ridge, elastic net, adaptive elastic net) on 16 cancers from TCGA after pre-screening, (ii) we propose a bi-dimensional pre-screening procedure based on both gene variability and p-values from single-variable Cox models to predict survival, and (iii) we compare our results with iterative sure independence screening (ISIS). RESULTS First, we show that integration of mRNA-seq data with clinical data improves predictions over clinical data alone. Second, our bi-dimensional pre-screening procedure yields only a moderate improvement in the C-index and/or the integrated Brier score, while excluding genes that are irrelevant for prediction. We demonstrate that the different penalization methods reached comparable prediction performances, with slight differences among datasets. Finally, we provide advice for the case of multi-omics data integration. CONCLUSIONS Tumor profiles convey more prognostic information than clinical variables such as stage for many cancer subtypes. Lasso and ridge penalizations perform similarly to elastic net penalizations for Cox models in high dimension. Pre-screening of the top 200 genes in terms of single-variable Cox model p-values is a practical way to reduce dimension, which may be particularly useful when integrating multi-omics data.
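The C-index used as a performance measure in this abstract can be sketched in a few lines. The version below is a minimal, quadratic-time illustration of Harrell's concordance index that ignores tied survival times; the function name, the toy follow-up data, and the risk scores are hypothetical and are not taken from the paper.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when patient i had an observed event and
    patient j was still under observation at that time (times[i] < times[j]).
    The pair is concordant when the higher risk score belongs to the patient
    with the earlier event; tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times (months), event flags (1 = death observed,
# 0 = censored) and model risk scores for four patients.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.4, 0.1]

c_good = concordance_index(times, events, risk)            # risk falls as survival lengthens
c_bad = concordance_index(times, events, risk[::-1])       # ordering reversed
c_flat = concordance_index(times, events, [0.5] * 4)       # uninformative model
```

A value of 0.5 corresponds to an uninformative model and 1.0 to a perfect ranking, which is why gains in C-index from pre-screening or added covariates are often small in absolute terms.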
Affiliation(s)
- Rémy Jardillier
  - IRIG, Biosanté U1292, Univ. Grenoble Alpes, Inserm, CEA, Grenoble, France
  - GIPSA-lab, Institute of Engineering University Grenoble Alpes, Univ. Grenoble Alpes, CNRS, Grenoble INP, Grenoble, France
- Dzenis Koca
  - IRIG, Biosanté U1292, Univ. Grenoble Alpes, Inserm, CEA, Grenoble, France
- Florent Chatelain
  - GIPSA-lab, Institute of Engineering University Grenoble Alpes, Univ. Grenoble Alpes, CNRS, Grenoble INP, Grenoble, France
- Laurent Guyon
  - IRIG, Biosanté U1292, Univ. Grenoble Alpes, Inserm, CEA, Grenoble, France
|
30
|
Nascimben M, Rimondini L, Corà D, Venturin M. Polygenic risk modeling of tumor stage and survival in bladder cancer. BioData Min 2022; 15:23. [PMID: 36175974 PMCID: PMC9523990 DOI: 10.1186/s13040-022-00306-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Accepted: 09/18/2022] [Indexed: 11/26/2022] Open
Abstract
Introduction Bladder cancer assessment with non-invasive gene expression signatures facilitates the detection of patients at risk and surveillance of their status, avoiding the discomfort of cystoscopy. To achieve accurate cancer estimation, analysis pipelines for gene expression data (GED) may integrate a sequence of several machine learning and bio-statistical techniques to model complex characteristics of pathological patterns. Methods Numerical experiments tested the combination of GED preprocessing by discretization with tree ensemble embeddings and nonlinear dimensionality reductions to categorize oncological patients comprehensively. Modeling aimed to identify tumor stage and distinguish survival outcomes in two situations: complete and partial data embedding. The latter condition simulates the addition of new patients to an existing model for rapid monitoring of disease progression. Machine learning procedures were employed to identify the most relevant genes involved in patient prognosis and to test the performance of preprocessed GED compared to untransformed data in predicting patient conditions. Results Data embedding paired with dimensionality reduction produced prognostic maps with well-defined clusters of patients, suitable for medical decision support. A second experiment simulated the addition of new patients to an existing model (partial data embedding): Uniform Manifold Approximation and Projection (UMAP) with uniform data discretization led to better outcomes than the other analyzed pipelines. Further exploration of the parameter space for UMAP and t-distributed stochastic neighbor embedding (t-SNE) underlined the importance of tuning a higher number of parameters for UMAP than for t-SNE. Moreover, two different machine learning experiments identified a group of genes valuable for partitioning patients (gene relevance analysis) and showed the higher precision obtained by preprocessed data in predicting tumor outcomes for cancer stage and survival rate (six-class prediction). Conclusions The present investigation proposed new analysis pipelines for disease outcome modeling from bladder cancer-related biomarkers. Complete and partial data embedding experiments suggested that pipelines employing UMAP had a more accurate predictive ability, supporting the recent literature trends on this methodology. However, several UMAP parameters were found to influence the experimental results, so we recommend that researchers pay attention to this aspect of the UMAP technique. Machine learning procedures further demonstrated the effectiveness of the proposed preprocessing in predicting patients’ conditions and determined a sub-group of biomarkers significant for forecasting bladder cancer prognosis.
Affiliation(s)
- Mauro Nascimben
  - Department of Health Sciences, Università del Piemonte Orientale, Via Solaroli 17, 28100, Novara, Italy
  - Enginsoft SpA, Via Giambellino 7, 35129, Padova, Italy
- Lia Rimondini
  - Department of Health Sciences, Università del Piemonte Orientale, Via Solaroli 17, 28100, Novara, Italy
- Davide Corà
  - Department of Health Sciences, Università del Piemonte Orientale, Via Solaroli 17, 28100, Novara, Italy
  - Department of Translational Medicine, Università del Piemonte Orientale, Via Solaroli 17, 28100, Novara, Italy
|
31
|
Madhumita, Dwivedi A, Paul S. Recursive integration of synergised graph representations of multi-omics data for cancer subtypes identification. Sci Rep 2022; 12:15629. [PMID: 36115864 PMCID: PMC9482647 DOI: 10.1038/s41598-022-17585-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2021] [Accepted: 07/27/2022] [Indexed: 11/09/2022] Open
Abstract
Cancer subtype identification is one of the critical steps toward advancing personalized anti-cancer therapies. The accumulation of a massive amount of multi-platform omics data measured across the same set of samples provides an opportunity to examine this deadly disease from several views simultaneously. A few integrative clustering approaches have been developed to capture shared information from all the views to identify cancer subtypes. However, they have certain limitations. The challenge is to identify the most relevant feature space from each omic view and to integrate these spaces systematically. Both steps should lead toward a global clustering solution with biological significance. To this end, a novel multi-omics clustering algorithm named RISynG (Recursive Integration of Synergised Graph-representations) is presented in this study. RISynG represents each omic view as two representation matrices, a Gramian and a Laplacian. A parameterised combination function is defined to obtain a synergy matrix from these representation matrices. Then a recursive multi-kernel approach is applied to integrate the most relevant, shared, and complementary information captured via the respective synergy matrices. Finally, clustering is applied to the integrated subspace. RISynG is benchmarked on five multi-omics cancer datasets taken from The Cancer Genome Atlas. The experimental results demonstrate RISynG’s efficiency over the other approaches in this domain.
|
32
|
Hong HJ, Joung KH, Kim YK, Choi MJ, Kang SG, Kim JT, Kang YE, Chang JY, Moon JH, Jun S, Ro HJ, Lee Y, Kim H, Park JH, Kang BE, Jo Y, Choi H, Ryu D, Lee CH, Kim H, Park KS, Kim HJ, Shong M. Mitoribosome insufficiency in β cells is associated with type 2 diabetes-like islet failure. Exp Mol Med 2022; 54:932-945. [PMID: 35804190 PMCID: PMC9355985 DOI: 10.1038/s12276-022-00797-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/03/2021] [Revised: 02/22/2022] [Accepted: 03/14/2022] [Indexed: 12/04/2022]
Abstract
Genetic variations in mitoribosomal subunits and mitochondrial transcription factors are related to type 2 diabetes. However, the role of islet mitoribosomes in the development of type 2 diabetes has not been determined. We investigated the effects of the mitoribosomal gene on β-cell function and glucose homeostasis. Mitoribosomal gene expression was analyzed in datasets from the NCBI GEO website (GSE25724, GSE76894, and GSE76895) and the European Nucleotide Archive (ERP017126), which contain the transcriptomes of type 2 diabetic and nondiabetic organ donors. We found deregulation of most mitoribosomal genes in islets from individuals with type 2 diabetes, including partial downregulation of CRIF1. The phenotypes of haploinsufficiency in a single mitoribosomal gene were examined using β-cell-specific Crif1 (Mrpl59) heterozygous-deficient mice. Crif1beta+/− mice had normal glucose tolerance, but their islets showed a loss of first-phase glucose-stimulated insulin secretion. They also showed increased β-cell mass associated with higher expression of Reg family genes. However, Crif1beta+/− mice showed earlier islet failure in response to high-fat feeding, which was exacerbated by aging. Haploinsufficiency of a single mitoribosomal gene predisposes rodents to glucose intolerance, which resembles the early stages of type 2 diabetes in humans. Disruptions in the mitochondrial protein synthesis machinery give rise to metabolic disturbances that lay the foundation for type 2 diabetes. As physiological glucose levels rise, the energy-generating machinery of the mitochondria responds with increased activity, which stimulates insulin secretion. Many proteins responsible for mitochondrial metabolism are produced by ribosomes within this cellular organelle. 
Researchers led by Hyun Jin Kim and Minho Shong at Chungnam National University, Daejeon, South Korea, have determined that mutations affecting a mitochondrial ribosomal protein called CRIF1 can lead to impaired insulin release. Mice with reduced CRIF1 were initially healthy, but as they aged they exhibited signs of impaired pancreatic function similar to those seen in patients with early-stage diabetes. This process was accelerated by consumption of a high-fat diet, and the researchers propose that this mechanism may be directly relevant to human disease.
Affiliation(s)
- Hyun Jung Hong
  - Research Center for Endocrine and Metabolic Diseases, Chungnam National University School of Medicine, Daejeon, 35015, Korea; Department of Medical Science, Chungnam National University School of Medicine, Daejeon, 35015, Korea
- Kyong Hye Joung
  - Research Center for Endocrine and Metabolic Diseases, Chungnam National University School of Medicine, Daejeon, 35015, Korea; Department of Internal Medicine, Chungnam National University School of Medicine, Daejeon, 35015, Korea
- Yong Kyung Kim
  - Research Center for Endocrine and Metabolic Diseases, Chungnam National University School of Medicine, Daejeon, 35015, Korea
- Min Jeong Choi
  - Research Center for Endocrine and Metabolic Diseases, Chungnam National University School of Medicine, Daejeon, 35015, Korea
- Seul Gi Kang
  - Research Center for Endocrine and Metabolic Diseases, Chungnam National University School of Medicine, Daejeon, 35015, Korea
- Jung Tae Kim
  - Research Center for Endocrine and Metabolic Diseases, Chungnam National University School of Medicine, Daejeon, 35015, Korea; Department of Medical Science, Chungnam National University School of Medicine, Daejeon, 35015, Korea
- Yea Eun Kang
  - Research Center for Endocrine and Metabolic Diseases, Chungnam National University School of Medicine, Daejeon, 35015, Korea; Department of Internal Medicine, Chungnam National University School of Medicine, Daejeon, 35015, Korea
- Joon Young Chang
  - Research Center for Endocrine and Metabolic Diseases, Chungnam National University School of Medicine, Daejeon, 35015, Korea; Department of Medical Science, Chungnam National University School of Medicine, Daejeon, 35015, Korea
- Joon Ho Moon
  - Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Korea
- Sangmi Jun
  - Center for Research Equipment, Korea Basic Science Institute, Cheongju, 28119, Korea; Convergent Research Center for Emerging Virus Infection, Korea Research Institute of Chemical Technology, Daejeon, 34114, Korea
- Hyun-Joo Ro
  - Center for Research Equipment, Korea Basic Science Institute, Cheongju, 28119, Korea; Convergent Research Center for Emerging Virus Infection, Korea Research Institute of Chemical Technology, Daejeon, 34114, Korea
- Yujeong Lee
  - Center for Research Equipment, Korea Basic Science Institute, Cheongju, 28119, Korea; Convergent Research Center for Emerging Virus Infection, Korea Research Institute of Chemical Technology, Daejeon, 34114, Korea
- Hyeongseok Kim
  - Department of Biochemistry, Chungnam National University School of Medicine, Daejeon, 35015, Korea
- Jae-Hyung Park
  - Department of Physiology, Keimyung University School of Medicine, Daegu, 704-200, Korea
- Baeki E Kang
  - Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, Suwon, 16419, Korea
- Yunju Jo
  - Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, Suwon, 16419, Korea
- Heejung Choi
  - Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, Suwon, 16419, Korea
- Dongryeol Ryu
  - Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, Suwon, 16419, Korea; Biomedical Institute for Convergence at SKKU (BICS), Sungkyunkwan University, Suwon, 16419, Korea; Samsung Biomedical Research Institute, Samsung Medical Center, Seoul, 06351, Korea
- Chul-Ho Lee
  - Animal Model Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, 34141, Korea
- Hail Kim
  - Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Korea
- Kyu-Sang Park
  - Department of Physiology, Yonsei University Wonju College of Medicine, Wonju, 26426, Korea
- Hyun Jin Kim
  - Research Center for Endocrine and Metabolic Diseases, Chungnam National University School of Medicine, Daejeon, 35015, Korea; Department of Internal Medicine, Chungnam National University School of Medicine, Daejeon, 35015, Korea
- Minho Shong
  - Research Center for Endocrine and Metabolic Diseases, Chungnam National University School of Medicine, Daejeon, 35015, Korea; Department of Medical Science, Chungnam National University School of Medicine, Daejeon, 35015, Korea; Department of Internal Medicine, Chungnam National University School of Medicine, Daejeon, 35015, Korea
|
33
|
Development and validation of an RNA-seq-based transcriptomic risk score for asthma. Sci Rep 2022; 12:8643. [PMID: 35606385 PMCID: PMC9126925 DOI: 10.1038/s41598-022-12199-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2022] [Accepted: 05/04/2022] [Indexed: 11/30/2022] Open
Abstract
Recent progress in RNA sequencing (RNA-seq) allows us to explore whole-genome gene expression profiles and to develop predictive models for disease risk. The objective of this study was to develop and validate an RNA-seq-based transcriptomic risk score (RSRS) for disease risk prediction that can simultaneously accommodate demographic information. We analyzed RNA-seq gene expression data from 441 asthmatic and 254 non-asthmatic samples. Logistic least absolute shrinkage and selection operator (Lasso) regression analysis in the training set identified 73 differentially expressed genes (DEGs) to form a weighted RSRS that discriminated asthmatics from healthy subjects with an area under the curve (AUC) of 0.80 in the testing set after adjustment for age and gender. The 73-gene RSRS was validated in three independent RNA-seq datasets and achieved AUCs of 0.70, 0.77 and 0.60, respectively. To explore their biological and molecular functions in the asthma phenotype, we examined the 73 genes by enrichment pathway analysis and found that these genes were significantly (p < 0.0001) enriched for DNA replication, recombination, and repair, cell-to-cell signaling and interaction, and eumelanin biosynthesis and developmental disorder. Further in-silico analyses of the 73 genes using the Connectivity Map identified drugs (mepacrine, dactolisib) and genetic perturbagens (PAK1, GSR, RBM15 and TNFRSF12A) that could potentially be repurposed for treating asthma. These findings show the promise of RNA-seq risk scores to stratify and predict disease risk.
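The two quantities this abstract relies on, a weighted risk score and the AUC used to evaluate it, can be sketched in a few lines. This is a minimal illustration, not the study's pipeline: the gene names, weights, and samples below are made up, the Lasso fitting step that would produce the weights is not shown, and `risk_score` and `auc` are hypothetical helper names.

```python
def risk_score(expression, weights):
    """Weighted sum of expression values over the genes that carry a weight."""
    return sum(w * expression.get(gene, 0.0) for gene, w in weights.items())


def auc(scores, labels):
    """AUC as the probability that a random case scores above a random control.

    Ties count as one half. Labels are 1 for cases and 0 for controls.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical weights, standing in for coefficients from a fitted model.
weights = {"GENE_A": 0.8, "GENE_B": -0.5}

# Hypothetical samples: (expression profile, label).
samples = [
    ({"GENE_A": 2.0, "GENE_B": 1.0}, 1),
    ({"GENE_A": 1.5, "GENE_B": 0.2}, 1),
    ({"GENE_A": 0.3, "GENE_B": 1.1}, 0),
    ({"GENE_A": 0.9, "GENE_B": 2.0}, 0),
]

scores = [risk_score(expr, weights) for expr, _ in samples]
labels = [label for _, label in samples]
print(auc(scores, labels))
```

Demographic covariates such as age and gender could enter the same weighted sum as extra terms; how the study performed its adjustment is not described in the abstract.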
|
34
|
Asrih M, Dusaulcy R, Gosmain Y, Philippe J, Somm E, Jornayvaz FR, Kang BE, Jo Y, Choi MJ, Yi HS, Ryu D, Gariani K. Growth differentiation factor-15 prevents glucotoxicity and connexin-36 downregulation in pancreatic beta-cells. Mol Cell Endocrinol 2022; 541:111503. [PMID: 34763008 DOI: 10.1016/j.mce.2021.111503] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/22/2021] [Revised: 10/26/2021] [Accepted: 10/29/2021] [Indexed: 01/11/2023]
Abstract
Pancreatic beta-cell dysfunction is a hallmark of type 2 diabetes. Growth differentiation factor 15 (GDF15), which is an energy homeostasis regulator, has been shown to improve several metabolic parameters in the context of diabetes. However, its effects on pancreatic beta-cells remain to be identified. We therefore performed experiments using cell models and histological sectioning of wild-type and GDF15 knock-out mice to determine the effect of GDF15 on insulin secretion and cell viability. A bioinformatics analysis was performed to identify GDF15-correlated genes. GDF15 prevents glucotoxicity-mediated alteration of glucose-stimulated insulin secretion (GSIS) and connexin-36 downregulation. Inhibition of endogenous GDF15 reduced GSIS in cultured mouse beta-cells under standard conditions, while it had no impact on GSIS in cells exposed to glucolipotoxicity, which is a diabetogenic condition. Furthermore, this inhibition exacerbated the glucolipotoxicity-induced reduction in cell survival. This suggests that endogenous GDF15 in beta-cells is required for cell survival but not GSIS in the context of glucolipotoxicity.
Affiliation(s)
- Mohamed Asrih
  - Service of Endocrinology, Diabetes, Nutrition and Patient Therapeutic Education, Geneva University Hospitals, Rue Gabrielle-Perret-Gentil 4, 1205, Geneva, Switzerland; University of Geneva Medical School, 1211, Geneva, Switzerland
- Rodolphe Dusaulcy
  - Service of Endocrinology, Diabetes, Nutrition and Patient Therapeutic Education, Geneva University Hospitals, Rue Gabrielle-Perret-Gentil 4, 1205, Geneva, Switzerland; University of Geneva Medical School, 1211, Geneva, Switzerland
- Yvan Gosmain
  - Service of Endocrinology, Diabetes, Nutrition and Patient Therapeutic Education, Geneva University Hospitals, Rue Gabrielle-Perret-Gentil 4, 1205, Geneva, Switzerland; University of Geneva Medical School, 1211, Geneva, Switzerland
- Jacques Philippe
  - Service of Endocrinology, Diabetes, Nutrition and Patient Therapeutic Education, Geneva University Hospitals, Rue Gabrielle-Perret-Gentil 4, 1205, Geneva, Switzerland; University of Geneva Medical School, 1211, Geneva, Switzerland
- Emmanuel Somm
  - Service of Endocrinology, Diabetes, Nutrition and Patient Therapeutic Education, Geneva University Hospitals, Rue Gabrielle-Perret-Gentil 4, 1205, Geneva, Switzerland; University of Geneva Medical School, 1211, Geneva, Switzerland
- François R Jornayvaz
  - Service of Endocrinology, Diabetes, Nutrition and Patient Therapeutic Education, Geneva University Hospitals, Rue Gabrielle-Perret-Gentil 4, 1205, Geneva, Switzerland; University of Geneva Medical School, 1211, Geneva, Switzerland
- Baeki E Kang
  - Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, 16419, Suwon, Republic of Korea
- Yunju Jo
  - Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, 16419, Suwon, Republic of Korea
- Min Jeong Choi
  - Research Center for Endocrine and Metabolic Diseases, Chungnam National University Hospital, Chungnam National University School of Medicine, 35015, Daejeon, Republic of Korea; Department of Medical Science, Chungnam National University School of Medicine, 35015, Daejeon, Republic of Korea
- Hyon-Seung Yi
  - Research Center for Endocrine and Metabolic Diseases, Chungnam National University Hospital, Chungnam National University School of Medicine, 35015, Daejeon, Republic of Korea; Department of Medical Science, Chungnam National University School of Medicine, 35015, Daejeon, Republic of Korea
- Dongryeol Ryu
  - Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, 16419, Suwon, Republic of Korea; Biomedical Institute for Convergence at SKKU (BICS), Sungkyunkwan University, 16419, Suwon, Republic of Korea; Samsung Biomedical Research Institute, Samsung Medical Center, 06351, Seoul, Republic of Korea
- Karim Gariani
  - Service of Endocrinology, Diabetes, Nutrition and Patient Therapeutic Education, Geneva University Hospitals, Rue Gabrielle-Perret-Gentil 4, 1205, Geneva, Switzerland; University of Geneva Medical School, 1211, Geneva, Switzerland
|
35
|
Kaur N, Oskotsky B, Butte AJ, Hu Z. Systematic identification of ACE2 expression modulators reveals cardiomyopathy as a risk factor for mortality in COVID-19 patients. Genome Biol 2022; 23:15. [PMID: 35012625 PMCID: PMC8743438 DOI: 10.1186/s13059-021-02589-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 11/10/2020] [Accepted: 12/23/2021] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND Angiotensin-converting enzyme 2 (ACE2) is the cell-entry receptor for SARS-CoV-2. It plays critical roles in both the transmission and the pathogenesis of COVID-19. Comprehensive profiling of ACE2 expression patterns could reveal risk factors of severe COVID-19 illness. While the expression of ACE2 in healthy human tissues has been well characterized, it is not known which diseases and drugs might be associated with ACE2 expression. RESULTS We develop GENEVA (GENe Expression Variance Analysis), a semi-automated framework for exploring massive amounts of RNA-seq datasets. We apply GENEVA to 286,650 publicly available RNA-seq samples to identify any previously studied experimental conditions that could be directly or indirectly associated with ACE2 expression. We identify multiple drugs, genetic perturbations, and diseases that are associated with the expression of ACE2, including cardiomyopathy, HNF1A overexpression, and drug treatments with RAD140 and itraconazole. Our joint analysis of seven datasets confirms ACE2 upregulation in all cardiomyopathy categories. Using electronic health records data from 3936 COVID-19 patients, we demonstrate that patients with pre-existing cardiomyopathy have a higher mortality risk than age-matched patients with other cardiovascular conditions. GENEVA is applicable to any genes of interest and is freely accessible at http://genevatool.org. CONCLUSIONS This study identifies multiple diseases and drugs that are associated with the expression of ACE2. The effect of these conditions should be carefully studied in COVID-19 patients. In particular, our analysis identifies cardiomyopathy patients as a high-risk group, with increased ACE2 expression in the heart and increased mortality after SARS-CoV-2 infection.
Affiliation(s)
- Navchetan Kaur
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
  - Department of Pediatrics, University of California, San Francisco, CA, USA
- Boris Oskotsky
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
  - Department of Pediatrics, University of California, San Francisco, CA, USA
- Atul J Butte
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
  - Department of Pediatrics, University of California, San Francisco, CA, USA
- Zicheng Hu
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
  - Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA, USA
|
36
|
Mourragui SMC, Loog M, Vis DJ, Moore K, Manjon AG, van de Wiel MA, Reinders MJT, Wessels LFA. Predicting patient response with models trained on cell lines and patient-derived xenografts by nonlinear transfer learning. Proc Natl Acad Sci U S A 2021; 118:e2106682118. [PMID: 34873056 PMCID: PMC8670522 DOI: 10.1073/pnas.2106682118] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Accepted: 10/18/2021] [Indexed: 12/13/2022] Open
Abstract
Preclinical models have been the workhorse of cancer research, producing massive amounts of drug response data. Unfortunately, translating response biomarkers derived from these datasets to human tumors has proven to be particularly challenging. To address this challenge, we developed TRANSACT, a computational framework that builds a consensus space to capture biological processes common to preclinical models and human tumors and exploits this space to construct drug response predictors that robustly transfer from preclinical models to human tumors. TRANSACT performs favorably compared to four competing approaches, including two deep learning approaches, on a set of 23 drug prediction challenges on The Cancer Genome Atlas and 226 metastatic tumors from the Hartwig Medical Foundation. We demonstrate that response predictions deliver a robust performance for a number of therapies of high clinical importance: platinum-based chemotherapies, gemcitabine, and paclitaxel. In contrast to other approaches, we demonstrate the interpretability of the TRANSACT predictors by correctly identifying known biomarkers of targeted therapies, and we propose potential mechanisms that mediate the resistance to two chemotherapeutic agents.
Affiliation(s)
- Soufiane M C Mourragui
  - Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
  - Department of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 XE Delft, The Netherlands
- Marco Loog
  - Department of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 XE Delft, The Netherlands
  - Department of Computer Science, University of Copenhagen, 2100 Copenhagen, Denmark
- Daniel J Vis
  - Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
- Kat Moore
  - Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
- Anna G Manjon
  - Division of Cell Biology, Oncode Institute, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
- Mark A van de Wiel
  - Epidemiology and Biostatistics, Amsterdam University Medical Center, 1105 AZ Amsterdam, The Netherlands
  - Medical Research Council Biostatistics Unit, Cambridge University, Cambridge CB2 0SR, United Kingdom
- Marcel J T Reinders
  - Department of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 XE Delft, The Netherlands
  - Leiden Computational Biology Center, Leiden University Medical Center, 2333 ZC Leiden, The Netherlands
- Lodewyk F A Wessels
  - Division of Molecular Carcinogenesis, Oncode Institute, The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
  - Department of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, 2628 XE Delft, The Netherlands
|
37
|
Weaver DT, Pishas KI, Williamson D, Scarborough J, Lessnick SL, Dhawan A, Scott JG. Network potential identifies therapeutic miRNA cocktails in Ewing sarcoma. PLoS Comput Biol 2021; 17:e1008755. [PMID: 34662337 PMCID: PMC8601628 DOI: 10.1371/journal.pcbi.1008755] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 02/03/2021] [Revised: 11/18/2021] [Accepted: 09/20/2021] [Indexed: 12/16/2022] Open
Abstract
MicroRNA (miRNA)-based therapies are an emerging class of targeted therapeutics with many potential applications. Ewing Sarcoma patients could benefit dramatically from personalized miRNA therapy due to inter-patient heterogeneity and a lack of druggable (to this point) targets. However, because of the broad effects miRNAs may have on different cells and tissues, trials of miRNA therapies have struggled due to severe toxicity and unanticipated immune response. In order to overcome this hurdle, a network science-based approach is well-equipped to evaluate and identify miRNA candidates and combinations of candidates for the repression of key oncogenic targets while avoiding repression of essential housekeeping genes. We first characterized 6 Ewing sarcoma cell lines using mRNA sequencing. We then estimated a measure of tumor state, which we term network potential, based on both the mRNA gene expression and the underlying protein-protein interaction network in the tumor. Next, we ranked mRNA targets based on their contribution to network potential. We then identified miRNAs and combinations of miRNAs that preferentially act to repress mRNA targets with the greatest influence on network potential. Our analysis identified TRIM25, APP, ELAV1, RNF4, and HNRNPL as ideal mRNA targets for Ewing sarcoma therapy. Using predicted miRNA-mRNA target mappings, we identified miR-3613-3p, let-7a-3p, miR-300, miR-424-5p, and let-7b-3p as candidate optimal miRNAs for preferential repression of these targets. Ultimately, our work, as exemplified in the case of Ewing sarcoma, describes a novel pipeline by which personalized miRNA cocktails can be designed to maximally perturb gene networks contributing to cancer progression.
Affiliation(s)
- Davis T. Weaver
  - Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
  - Translational Hematology Oncology Research, Cleveland Clinic, Cleveland, Ohio, United States of America
- Drew Williamson
  - Department of Pathology, Brigham & Women’s Hospital, Boston, Massachusetts, United States of America
- Jessica Scarborough
  - Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
  - Translational Hematology Oncology Research, Cleveland Clinic, Cleveland, Ohio, United States of America
- Andrew Dhawan
  - Translational Hematology Oncology Research, Cleveland Clinic, Cleveland, Ohio, United States of America
  - Division of Neurology, Cleveland Clinic, Cleveland, Ohio, United States of America
  - * E-mail: (AD); (JGS)
- Jacob G. Scott
  - Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
  - Translational Hematology Oncology Research, Cleveland Clinic, Cleveland, Ohio, United States of America
  - Department of Physics, Case Western Reserve University, Cleveland, Ohio, United States of America
  - * E-mail: (AD); (JGS)
|
38
|
El-Saadani S, Saleh M, Ibrahim SA. Quantifying non-communicable diseases' burden in Egypt using State-Space model. PLoS One 2021; 16:e0245642. [PMID: 34375334 PMCID: PMC8354445 DOI: 10.1371/journal.pone.0245642] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/03/2021] [Accepted: 07/12/2021] [Indexed: 11/26/2022] Open
Abstract
The study aimed to model and quantify the health burden induced by four non-communicable diseases (NCDs) in Egypt, and is the first such study to be conducted in the context of a less developed country. The study used the State-Space model and adopted two Bayesian methods, Particle Filter and Particle Independent Metropolis-Hastings, to model and estimate the NCDs’ health burden trajectories. We drew on time-series data of the International Health Metric Evaluation, the Central Agency for Public Mobilization and Statistics (CAPMAS) Annual Bulletin of Health Services Statistics, the World Bank, and WHO data. Both Bayesian methods showed that the burden trajectories are on the rise. Most of the findings agreed with our assumptions and are in line with the literature. The previous year's burden strongly predicts the burden of the current year. High prevalence of the risk factors, disease prevalence, and the disease’s severity level all increase illness burden. Years of life lost due to death has high loadings in most of the diseases. Contrary to the study assumption, the results showed a negative relationship between disease burden and health services utilization, which can be attributed to the lack of full health insurance coverage and the pattern of health care seeking behavior in Egypt. Our study highlights that Particle Independent Metropolis-Hastings is sufficient for estimating the parameters of the study model in the case of time-constant parameters. The study recommends using State-Space models with Bayesian estimation approaches with time-series data in public health and epidemiology research.
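To make the Particle Filter step concrete, here is a minimal bootstrap particle filter for a one-dimensional linear-Gaussian state-space model, in which the latent burden follows an AR(1) process and is observed with noise. It is a generic textbook sketch under stated assumptions, not the study's model: the function name `particle_filter`, the parameters `phi`, `q`, and `r`, and the toy observations are all hypothetical, and the Particle Independent Metropolis-Hastings step for parameter estimation is not shown.

```python
import math
import random


def particle_filter(observations, n_particles=500, phi=0.9, q=0.5, r=1.0, seed=0):
    """Bootstrap particle filter returning the filtered mean at each time step.

    Assumed model (illustrative only):
        state:        x_t = phi * x_{t-1} + N(0, q^2)
        observation:  y_t = x_t + N(0, r^2)
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    filtered_means = []
    for y in observations:
        # Propagate each particle through the state equation.
        particles = [phi * x + rng.gauss(0.0, q) for x in particles]
        # Weight particles by the Gaussian observation likelihood.
        weights = [math.exp(-0.5 * ((y - x) / r) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:
            # Fall back to uniform weights if every likelihood underflows.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        filtered_means.append(sum(w * x for w, x in zip(weights, particles)))
        # Resample in proportion to the weights (multinomial resampling).
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return filtered_means


# Toy series standing in for a yearly burden indicator.
estimates = particle_filter([0.5, 1.0, 1.5, 2.1, 2.4])
```

The autoregressive state equation mirrors the abstract's finding that the previous year's burden predicts the current year's; with `phi` close to 1 the filtered trajectory changes slowly from year to year.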
Affiliation(s)
- Somaya El-Saadani
  - Department of Biostatistics and Demography, Faculty of Graduate Studies for Statistical Research, Cairo University, Cairo, Egypt
- Mohamed Saleh
  - Faculty of Computers and Artificial Intelligence, Cairo University, Cairo, Egypt
- Sarah A. Ibrahim
  - Department of Biostatistics and Demography, Faculty of Graduate Studies for Statistical Research, Cairo University, Cairo, Egypt
|
39
|
Robust integration of multiple single-cell RNA sequencing datasets using a single reference space. Nat Biotechnol 2021; 39:877-884. [PMID: 33767393 PMCID: PMC8456427 DOI: 10.1038/s41587-021-00859-x] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 10/29/2019] [Accepted: 02/16/2021] [Indexed: 01/31/2023]
Abstract
In many biological applications of single-cell RNA sequencing (scRNA-seq), an integrated analysis of data from multiple batches or studies is necessary. Current methods typically achieve integration using shared cell types or covariance correlation between datasets, which can distort biological signals. Here we introduce an algorithm that uses the gene eigenvectors from a reference dataset to establish a global frame for integration. Using simulated and real datasets, we demonstrate that this approach, called Reference Principal Component Integration (RPCI), consistently outperforms other methods by multiple metrics, with clear advantages in preserving genuine cross-sample gene expression differences in matching cell types, such as those present in cells at distinct developmental stages or in perturbated versus control studies. Moreover, RPCI maintains this robust performance when multiple datasets are integrated. Finally, we applied RPCI to scRNA-seq data for mouse gut endoderm development and revealed temporal emergence of genetic programs helping establish the anterior-posterior axis in visceral endoderm.
|
40
|
Faillace GR, Caruso PB, Timmers LFSM, Favero D, Guzman FL, Rechenmacher C, de Oliveira-Busatto LA, de Souza ON, Bredemeier C, Bodanese-Zanettini MH. Molecular Characterisation of Soybean Osmotins and Their Involvement in Drought Stress Response. Front Genet 2021; 12:632685. [PMID: 34249077 PMCID: PMC8267864 DOI: 10.3389/fgene.2021.632685] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/23/2020] [Accepted: 04/09/2021] [Indexed: 11/13/2022] Open
Abstract
Osmotins are multifunctional proteins belonging to the thaumatin-like family related to plant stress responses. To better understand the functions of soybean osmotins in drought stress response, the current study presents the characterisation of four previously described proteins and a novel putative soybean osmotin (GmOLPa-like). Gene and protein structure as well as gene expression analyses were conducted on different tissues and developmental stages of two soybean cultivars with varying dehydration sensitivities (BR16 and EMB48 are highly and slightly sensitive, respectively). The analysed osmotin sequences share the conserved amino acid signature and 3D structure of the thaumatin-like family. Some differences were observed in the conserved regions of protein sequences and in the electrostatic surface potential. P21-like presents the electrostatic potential most similar to that of osmotins previously characterised as promoters of drought tolerance in Nicotiana tabacum and Solanum nigrum. Gene expression analysis indicated that soybean osmotins were differentially expressed in different organs (leaves and roots), developmental stages (R1 and V3), and cultivars in response to dehydration. In addition, under dehydration conditions, the highest level of gene expression was detected for GmOLPa-like and P21-like osmotins in the leaves and roots, respectively, of the less drought-sensitive cultivar. Altogether, the results suggest an involvement of these genes in drought stress tolerance.
Affiliation(s)
- Giulia Ramos Faillace
- Programa de Pós-Graduação em Genética e Biologia Molecular and Instituto Nacional de Ciência e Tecnologia: Biotec Seca-Pragas, Departamento de Genética, Instituto de Biociências, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Brazil
| | - Paula Bacaicoa Caruso
- Laboratório de Bioinformática, Modelagem e Simulação de Biossistemas (LABIO), Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Porto Alegre, Brazil
| | - Luis Fernando Saraiva Macedo Timmers
- Laboratório de Bioinformática, Modelagem e Simulação de Biossistemas (LABIO), Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Porto Alegre, Brazil.,Programa de Pós-Graduação em Biologia Celular e Molecular, Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Porto Alegre, Brazil
| | - Débora Favero
- Programa de Pós-Graduação em Fitotecnia, Departamento de Plantas de Lavoura, Faculdade de Agronomia, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Brazil
| | - Frank Lino Guzman
- Programa de Pós-Graduação em Biologia Celular e Molecular, Centro de Biotecnologia (CBiot), Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Brazil
| | - Ciliana Rechenmacher
- Programa de Pós-Graduação em Genética e Biologia Molecular and Instituto Nacional de Ciência e Tecnologia: Biotec Seca-Pragas, Departamento de Genética, Instituto de Biociências, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Brazil
| | - Luisa Abruzzi de Oliveira-Busatto
- Programa de Pós-Graduação em Genética e Biologia Molecular and Instituto Nacional de Ciência e Tecnologia: Biotec Seca-Pragas, Departamento de Genética, Instituto de Biociências, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Brazil
| | - Osmar Norberto de Souza
- Laboratório de Bioinformática, Modelagem e Simulação de Biossistemas (LABIO), Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Porto Alegre, Brazil.,Programa de Pós-Graduação em Biologia Celular e Molecular, Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Porto Alegre, Brazil
| | - Christian Bredemeier
- Programa de Pós-Graduação em Fitotecnia, Departamento de Plantas de Lavoura, Faculdade de Agronomia, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Brazil
| | - Maria Helena Bodanese-Zanettini
- Programa de Pós-Graduação em Genética e Biologia Molecular and Instituto Nacional de Ciência e Tecnologia: Biotec Seca-Pragas, Departamento de Genética, Instituto de Biociências, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Brazil
|
41
|
Simonetti G, Angeli D, Petracci E, Fonzi E, Vedovato S, Sperotto A, Padella A, Ghetti M, Ferrari A, Robustelli V, Di Liddo R, Conconi MT, Papayannidis C, Cerchione C, Rondoni M, Astolfi A, Ottaviani E, Martinelli G, Gottardi M. Adrenomedullin Expression Characterizes Leukemia Stem Cells and Associates With an Inflammatory Signature in Acute Myeloid Leukemia. Front Oncol 2021; 11:684396. [PMID: 34150648 PMCID: PMC8208888 DOI: 10.3389/fonc.2021.684396] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/23/2021] [Accepted: 04/23/2021] [Indexed: 12/13/2022]
Abstract
Adrenomedullin (ADM) is a hypotensive and vasodilator peptide belonging to the calcitonin gene-related peptide family. It is secreted in vitro by endothelial cells and vascular smooth muscle cells, and is significantly upregulated by a number of stimuli. Moreover, ADM participates in the regulation of the hematopoietic compartment, solid tumors and leukemias, such as acute myeloid leukemia (AML). To better characterize ADM involvement in AML pathogenesis, we investigated its expression during human hematopoiesis and in leukemic subsets, based on a morphological, cytogenetic and molecular characterization, and in T cells from AML patients. In hematopoietic stem/progenitor cells and T lymphocytes from healthy subjects, ADM transcript was barely detectable. It was expressed at low levels by megakaryocytes and erythroblasts, while higher levels were measured in neutrophils, monocytes and plasma cells. Moreover, cells populating the hematopoietic niche, including mesenchymal stem cells, were shown to express ADM. ADM was overexpressed in AML cells versus normal CD34+ cells and in the subset of leukemia compared with hematopoietic stem cells. In parallel, we detected a significant variation of ADM expression among cytogenetic subgroups, measuring the highest levels in inv(16)/t(16;16) or complex karyotype AML. According to the mutational status of AML-related genes, the analysis showed a lower expression of ADM in FLT3-ITD, NPM1-mutated AML and FLT3-ITD/NPM1-mutated cases compared with wild-type ones. Moreover, ADM expression had a negative impact on overall survival within the favorable risk class, while showing a potential positive impact within the subgroup receiving non-intensive treatment. The expression of 135 genes involved in leukemogenesis, regulation of cell proliferation, ferroptosis, protection from apoptosis, HIF-1α signaling, JAK-STAT pathway, immune and inflammatory responses was correlated with ADM levels in the bone marrow cells of at least two AML cohorts. Moreover, ADM was upregulated in CD4+ T and CD8+ T cells from AML patients compared with healthy controls, and some ADM co-expressed genes participate in a signature of immune tolerance that characterizes CD4+ T cells from leukemic patients. Overall, our study shows that ADM expression in AML associates with a stem cell phenotype, inflammatory signatures and genes related to immunosuppression, all factors that contribute to therapy resistance and disease relapse.
Affiliation(s)
- Giorgia Simonetti
- Biosciences Laboratory, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
| | - Davide Angeli
- Unit of Biostatistics and Clinical Trials, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
| | - Elisabetta Petracci
- Unit of Biostatistics and Clinical Trials, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
| | - Eugenio Fonzi
- Unit of Biostatistics and Clinical Trials, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
| | - Susanna Vedovato
- Department of Clinical and Experimental Medicine, University of Padova, Padua, Italy
| | - Alessandra Sperotto
- Hematology and Transplant Center Unit, Dipartimento di Area Medica (DAME), Udine University Hospital, Udine, Italy
| | - Antonella Padella
- Biosciences Laboratory, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
| | - Martina Ghetti
- Biosciences Laboratory, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
| | - Anna Ferrari
- Biosciences Laboratory, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
| | - Valentina Robustelli
- IRCCS Azienda Ospedaliero-Universitaria di Bologna, Istituto di Ematologia “Seràgnoli”, Bologna, Italy
- Dipartimento di Medicina Specialistica, Diagnostica e Sperimentale, Università di Bologna, Bologna, Italy
| | - Rosa Di Liddo
- Department of Pharmaceutical and Pharmacological Sciences, University of Padova, Padua, Italy
| | - Maria Teresa Conconi
- Department of Pharmaceutical and Pharmacological Sciences, University of Padova, Padua, Italy
| | - Cristina Papayannidis
- IRCCS Azienda Ospedaliero-Universitaria di Bologna, Istituto di Ematologia “Seràgnoli”, Bologna, Italy
| | - Claudio Cerchione
- Hematology Unit, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
| | - Michela Rondoni
- Hematology Unit & Romagna Transplant Network, Ravenna Hospital, Ravenna, Italy
| | - Annalisa Astolfi
- “Giorgio Prodi” Cancer Research Center, University of Bologna, Bologna, Italy
- Department of Morphology, Surgery and Experimental Medicine, University of Ferrara, Ferrara, Italy
| | - Emanuela Ottaviani
- IRCCS Azienda Ospedaliero-Universitaria di Bologna, Istituto di Ematologia “Seràgnoli”, Bologna, Italy
| | - Giovanni Martinelli
- Scientific Directorate, IRCCS Istituto Romagnolo per lo Studio dei Tumori (IRST) “Dino Amadori”, Meldola, Italy
| | - Michele Gottardi
- Onco Hematology, Department of Oncology, Veneto Institute of Oncology IOV, IRCCS, Padua, Italy
|
42
|
Zhang R, Ren Z, Celedón JC, Chen W. Inference of large modified Poisson-type graphical models: Application to RNA-seq data in childhood atopic asthma studies. Ann Appl Stat 2021. [DOI: 10.1214/20-aoas1413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Rong Zhang
- Department of Statistics, University of Pittsburgh
| | - Zhao Ren
- Department of Statistics, University of Pittsburgh
| | - Juan C. Celedón
- Department of Pediatrics, UPMC Children’s Hospital of Pittsburgh, University of Pittsburgh
| | - Wei Chen
- Department of Pediatrics, UPMC Children’s Hospital of Pittsburgh, University of Pittsburgh
|
43
|
Ascari G, Rendtorff ND, De Bruyne M, De Zaeytijd J, Van Lint M, Bauwens M, Van Heetvelde M, Arno G, Jacob J, Creytens D, Van Dorpe J, Van Laethem T, Rosseel T, De Pooter T, De Rijk P, De Coster W, Menten B, Rey AD, Strazisar M, Bertelsen M, Tranebjaerg L, De Baere E. Long-Read Sequencing to Unravel Complex Structural Variants of CEP78 Leading to Cone-Rod Dystrophy and Hearing Loss. Front Cell Dev Biol 2021; 9:664317. [PMID: 33968938 PMCID: PMC8097100 DOI: 10.3389/fcell.2021.664317] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 02/05/2021] [Accepted: 03/08/2021] [Indexed: 11/13/2022]
Abstract
Inactivating variants as well as a missense variant in the centrosomal CEP78 gene have been identified in autosomal recessive cone-rod dystrophy with hearing loss (CRDHL), a rare syndromic inherited retinal disease distinct from Usher syndrome. Apart from this, a complex structural variant (SV) implicating CEP78 has been reported in CRDHL. Here we aimed to expand the genetic architecture of typical CRDHL by the identification of complex SVs of the CEP78 region and characterization of their underlying mechanisms. The approaches used to identify the SVs were shallow whole-genome sequencing (sWGS) combined with quantitative polymerase chain reaction (PCR) and long-range PCR, or ExomeDepth analysis on whole-exome sequencing (WES) data. Targeted or whole-genome nanopore long-read sequencing (LRS) was used to delineate breakpoint junctions at the nucleotide level. For all SV cases, the effect of the SVs on CEP78 expression was assessed using quantitative PCR on patient-derived RNA. Apart from two novel canonical CEP78 splice variants and a frameshifting single-nucleotide variant (SNV), two SVs affecting CEP78 were identified in three unrelated individuals with CRDHL: a heterozygous total gene deletion of 235 kb and a partial gene deletion of 15 kb in a heterozygous and homozygous state, respectively. Assessment of the molecular consequences of the SVs on patient material showed a loss-of-function effect. Delineation and characterization of the 15-kb deletion using targeted LRS revealed the previously described complex CEP78 SV, suggestive of a recurrent genomic rearrangement. A founder haplotype was demonstrated for the latter SV in cases of Belgian and British origin, respectively. The novel 235-kb deletion was delineated using whole-genome LRS. Breakpoint analysis showed microhomology and pointed to a replication-based underlying mechanism. Moreover, data mining of bulk and single-cell human and mouse transcriptional datasets, together with CEP78 immunostaining on human retina, linked the CEP78 expression domain with its phenotypic manifestations. Overall, this study supports that the CEP78 locus is prone to distinct SVs and that SV analysis should be considered in a genetic workup of CRDHL. Finally, it demonstrated the power of sWGS and both targeted and whole-genome LRS in identifying and characterizing complex SVs in patients with ocular diseases.
Affiliation(s)
- Giulia Ascari
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Nanna D Rendtorff
- The Kennedy Center, Department of Clinical Genetics, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
| | - Marieke De Bruyne
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Julie De Zaeytijd
- Department of Ophthalmology, Ghent University Hospital, Ghent, Belgium
| | - Michel Van Lint
- Department of Ophthalmology, Antwerp University Hospital, Antwerp, Belgium
| | - Miriam Bauwens
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Mattias Van Heetvelde
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Gavin Arno
- Great Ormond Street Hospital, London, United Kingdom.,Moorfields Eye Hospital, London, United Kingdom.,UCL Institute of Ophthalmology, London, United Kingdom
| | - Julie Jacob
- Department of Ophthalmology, University Hospitals Leuven, Leuven, Belgium
| | - David Creytens
- Department of Pathology, Ghent University Hospital, Ghent, Belgium.,Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
| | - Jo Van Dorpe
- Department of Pathology, Ghent University Hospital, Ghent, Belgium.,Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
| | - Thalia Van Laethem
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Toon Rosseel
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Tim De Pooter
- Neuromics Support Facility, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium.,Neuromics Support Facility, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Peter De Rijk
- Neuromics Support Facility, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium.,Neuromics Support Facility, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Wouter De Coster
- Applied and Translational Neurogenomics Group, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium.,Applied and Translational Neurogenomics Group, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Björn Menten
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Alfredo Dueñas Rey
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
| | - Mojca Strazisar
- Neuromics Support Facility, VIB Center for Molecular Neurology, VIB, Antwerp, Belgium.,Neuromics Support Facility, Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
| | - Mette Bertelsen
- The Kennedy Center, Department of Clinical Genetics, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark.,Department of Ophthalmology, Rigshospitalet-Glostrup, University of Copenhagen, Glostrup, Denmark
| | - Lisbeth Tranebjaerg
- The Kennedy Center, Department of Clinical Genetics, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark.,Institute of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Elfride De Baere
- Center for Medical Genetics Ghent, Ghent University Hospital, Ghent, Belgium.,Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
|
44
|
Lim DK, Rashid NU, Ibrahim JG. MODEL-BASED FEATURE SELECTION AND CLUSTERING OF RNA-SEQ DATA FOR UNSUPERVISED SUBTYPE DISCOVERY. Ann Appl Stat 2021; 15:481-508. [PMID: 34457104 PMCID: PMC8386505 DOI: 10.1214/20-aoas1407] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]
Abstract
Clustering is a form of unsupervised learning that aims to uncover latent groups within data based on similarity across a set of features. A common application of this in biomedical research is in delineating novel cancer subtypes from patient gene expression data, given a set of informative genes. However, it is typically unknown a priori which genes may be informative in discriminating between clusters, and what the optimal number of clusters is. Few methods exist for performing unsupervised clustering of RNA-seq samples, and none currently adjust for between-sample global normalization factors, select cluster-discriminatory genes, or account for potential confounding variables during clustering. To address these issues, we propose the Feature Selection and Clustering of RNA-seq (FSCseq): a model-based clustering algorithm that utilizes a finite mixture of regression (FMR) model and the quadratic penalty method with a Smoothly-Clipped Absolute Deviation (SCAD) penalty. The maximization is done by a penalized Classification EM algorithm, allowing us to include normalization factors and confounders in our modeling framework. Given the fitted model, our framework allows for subtype prediction in new patients via posterior probabilities of cluster membership, even in the presence of batch effects. Based on simulations and real data analysis, we show the advantages of our method relative to competing approaches.
Affiliation(s)
- David K Lim
- University of North Carolina at Chapel Hill, NC, USA
| | - Naim U Rashid
- University of North Carolina at Chapel Hill, NC, USA
|
45
|
Hunt GJ, Gagnon-Bartsch JA. The role of scale in the estimation of cell-type proportions. Ann Appl Stat 2021. [DOI: 10.1214/20-aoas1395] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/19/2022]
|
46
|
Gill EE, Smith ML, Gibson KM, Morishita KA, Lee AHY, Falsafi R, Graham J, Foell D, Benseler SM, Ross CJ, Luqmani RA, Cabral DA, Hancock REW, Brown KL. Different Disease Endotypes in Phenotypically Similar Vasculitides Affecting Small-to-Medium Sized Blood Vessels. Front Immunol 2021; 12:638571. [PMID: 33692808 PMCID: PMC7937946 DOI: 10.3389/fimmu.2021.638571] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/07/2020] [Accepted: 02/01/2021] [Indexed: 11/13/2022]
Abstract
Objectives: Chronic primary vasculitis describes a group of complex and rare diseases that are characterized by blood vessel inflammation. Classification of vasculitis subtypes is based predominantly on the size of the involved vessels and clinical phenotype. There is a recognized need to improve classification, especially for small-to-medium sized vessel vasculitides, that, ideally, is based on the underlying biology with a view to informing treatment. Methods: We performed RNA-Seq on blood samples from children (n = 41) and from adults (n = 11) with small-to-medium sized vessel vasculitis, and used unsupervised hierarchical clustering of gene expression patterns in combination with clinical metadata to define disease subtypes. Results: Differential gene expression at the time of diagnosis separated patients into two primary endotypes that differed in the expression of ~3,800 genes in children, and ~1,600 genes in adults. These endotypes were also present during disease flares, and both adult and pediatric endotypes could be discriminated based on the expression of just 20 differentially expressed genes. Endotypes were associated with distinct biological processes, namely neutrophil degranulation and T cell receptor signaling. Conclusions: Phenotypically similar subsets of small-to-medium sized vessel vasculitis may have different mechanistic drivers involving innate vs. adaptive immune processes. Discovery of these differentiating immune features provides a mechanistic-based alternative for subclassification of vasculitis.
Affiliation(s)
- Erin E Gill
- Department of Microbiology and Immunology, University of British Columbia, Vancouver, BC, Canada
| | - Maren L Smith
- Department of Microbiology and Immunology, University of British Columbia, Vancouver, BC, Canada
| | - Kristen M Gibson
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada.,BC Children's Hospital Research Institute, Vancouver, BC, Canada
| | - Kimberly A Morishita
- Department of Pediatrics, University of British Columbia, Vancouver, BC, Canada.,BC Children's Hospital, Vancouver, BC, Canada
| | - Amy H Y Lee
- Department of Microbiology and Immunology, University of British Columbia, Vancouver, BC, Canada
| | - Reza Falsafi
- Department of Microbiology and Immunology, University of British Columbia, Vancouver, BC, Canada
| | - Jinko Graham
- Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC, Canada
| | - Dirk Foell
- Department of Pediatric Rheumatology and Immunology, University Hospital Muenster, Muenster, Germany
| | - Susanne M Benseler
- Department of Pediatrics, Alberta Children's Hospital, Calgary, AB, Canada
| | - Colin J Ross
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada.,Faculty of Pharmaceutical Sciences, University of British Columbia, Vancouver, BC, Canada
| | - Raashid A Luqmani
- Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, United Kingdom
| | - David A Cabral
- Department of Pediatrics, University of British Columbia, Vancouver, BC, Canada.,BC Children's Hospital, Vancouver, BC, Canada
| | - Robert E W Hancock
- Department of Microbiology and Immunology, University of British Columbia, Vancouver, BC, Canada.,Centre for Microbial Diseases and Immunity Research, University of British Columbia, Vancouver, BC, Canada.,Centre for Blood Research, University of British Columbia, Vancouver, BC, Canada
| | - Kelly L Brown
- BC Children's Hospital Research Institute, Vancouver, BC, Canada.,Department of Pediatrics, University of British Columbia, Vancouver, BC, Canada.,Centre for Blood Research, University of British Columbia, Vancouver, BC, Canada
|
47
|
Abstract
BACKGROUND How recurrent traumatic brain injury (rTBI) alters brain function years after insult is largely unknown. This study aims to characterize the mechanistic cause for long-term brain deterioration following rTBI using a rat model. METHODS Eighteen Sprague-Dawley wild-type rats underwent bilateral rTBI using a direct skull impact device or sham treatment, once per week for 5 weeks, and were euthanized 56 weeks after the first injury. Weekly rotarod performance measured motor deficits. Beam walk and grip strength were also assessed. Brain tissue was stained and volume was computed using Stereo Investigator's Cavalieri Estimator. The L5 cortical layer proximal to the injury site was microdissected and submitted for sequencing, with counts analyzed using the R packages "DESeq2" and "GOStats." Brain-derived neurotrophic factor (BDNF) levels were determined using enzyme-linked immunosorbent assay. RESULTS Rotarod data demonstrated permanent deficits 1 year after rTBI. Decreased beam walk performance and grip strength were noted among rTBI rodents. Recurrent traumatic brain injury led to thinner cortex and thinner corpus callosum, enlarged ventricles, and differential expression of 72 genes (25 upregulated, 47 downregulated) including dysregulation of those associated with TBI (BDNF, NR4A1/2/3, Arc, and Egr) and downregulation in pathways associated with neuroprotection and neuroplasticity. Over the course of the study, BDNF levels decreased in both rTBI and sham rodents, and at each time point, the decrease in BDNF was more pronounced after rTBI. CONCLUSION Recurrent traumatic brain injury causes significant long-term alteration in brain health leading to permanent motor deficits, cortical and corpus callosum thinning, and expansion of the lateral ventricles. Gene expression and BDNF analysis suggest a significant drop in pathways associated with neuroplasticity and neuroprotection. Although rTBI may not cause immediate neurological abnormalities, continued brain deterioration occurs after the initial trauma in part due to a decline in genes associated with neuroplasticity and neuroprotection.
|
48
|
Badri M, Kurtz ZD, Bonneau R, Müller CL. Shrinkage improves estimation of microbial associations under different normalization methods. NAR Genom Bioinform 2020; 2:lqaa100. [PMID: 33575644 PMCID: PMC7745771 DOI: 10.1093/nargab/lqaa100] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 03/03/2020] [Revised: 10/27/2020] [Accepted: 11/10/2020] [Indexed: 12/13/2022]
Abstract
Estimation of statistical associations in microbial genomic survey count data is fundamental to microbiome research. Experimental limitations, including count compositionality, low sample sizes and technical variability, obstruct standard application of association measures and require data normalization prior to statistical estimation. Here, we investigate the interplay between data normalization, microbial association estimation and available sample size by leveraging the large-scale American Gut Project (AGP) survey data. We analyze the statistical properties of two prominent linear association estimators, correlation and proportionality, under different sample scenarios and data normalization schemes, including RNA-seq analysis workflows and log-ratio transformations. We show that shrinkage estimation, a standard statistical regularization technique, can universally improve the quality of taxon-taxon association estimates for microbiome data. We find that large-scale association patterns in the AGP data can be grouped into five normalization-dependent classes. Using microbial association network construction and clustering as downstream data analysis examples, we show that variance-stabilizing and log-ratio approaches enable the most taxonomically and structurally coherent estimates. Taken together, the findings from our reproducible analysis workflow have important implications for microbiome studies in multiple stages of analysis, particularly when only small sample sizes are available.
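As a rough illustration of the two ingredients named in this abstract, the sketch below applies a centered log-ratio (CLR) transformation to synthetic counts and then shrinks the sample correlation matrix linearly toward the identity. It is only a minimal sketch under stated assumptions: the function names, the pseudocount, the synthetic Poisson data, and the fixed shrinkage intensity `lam` are all hypothetical choices made here, and the abstract does not say how the authors choose the shrinkage intensity.

```python
import numpy as np


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of a samples x taxa count matrix.

    The pseudocount (an assumption of this sketch) avoids log(0).
    """
    logged = np.log(counts + pseudocount)
    return logged - logged.mean(axis=1, keepdims=True)


def shrunk_correlation(data, lam):
    """Linear shrinkage of the sample correlation matrix toward the identity.

    lam = 0 returns the raw correlation matrix; lam = 1 returns the identity.
    A fixed lam is used here purely for illustration.
    """
    corr = np.corrcoef(data, rowvar=False)
    return (1.0 - lam) * corr + lam * np.eye(corr.shape[0])


# Synthetic counts: 10 samples, 6 taxa (not real microbiome data).
rng = np.random.default_rng(0)
counts = rng.poisson(lam=20.0, size=(10, 6))

z = clr(counts)
r = shrunk_correlation(z, lam=0.3)
```

With few samples the raw correlation estimate is noisy, so pulling the off-diagonal entries toward zero trades a little bias for lower variance; this is the general idea behind shrinkage estimation.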
Affiliation(s)
- Michelle Badri
- Department of Biology, New York University, New York, NY 10012, USA
| | | | - Richard Bonneau
- Department of Biology, New York University, New York, NY 10012, USA
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY 10010, USA
- Computer Science Department, Courant Institute, New York, NY 10012, USA
| | - Christian L Müller
- Center for Computational Mathematics, Flatiron Institute, Simons Foundation, New York, NY 10010, USA
- Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg 85764, Germany
- Department of Statistics, Ludwig-Maximilians-Universität München, Munich 80539, Germany
|
49
|
SRRM4 Expands the Repertoire of Circular RNAs by Regulating Microexon Inclusion. Cells 2020; 9:cells9112488. [PMID: 33207694 PMCID: PMC7697094 DOI: 10.3390/cells9112488] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 10/06/2020] [Revised: 10/27/2020] [Accepted: 11/13/2020] [Indexed: 12/25/2022]
Abstract
High-throughput RNA sequencing (RNA-seq) and dedicated bioinformatics pipelines have synergized to identify an expansive repertoire of unique circular RNAs (circRNAs), exceeding 100,000 variants. While the vast majority of these circRNAs comprise canonical exonic and intronic sequences, microexons (MEs), which occur in 30% of functional mRNA transcripts, have been entirely overlooked. CircRNAs which contain these known MEs (ME-circRNAs) could be identified with commonly utilized circRNA prediction pipelines, CIRCexplorer2 and CIRI2, but were not previously recognized as ME-circRNAs. In addition, when employing a bespoke bioinformatics pipeline for identifying RNA chimeras, called Hyb, we could also identify over 2000 ME-circRNAs which contain novel MEs at their backsplice junctions and which are not called by either CIRCexplorer2 or CIRI2. Analysis of circRNA-seq datasets from gliomas of varying clinical grades compared with matched control tissue has shown that circRNAs have potential as prognostic markers for stratifying tumor from healthy tissue. Furthermore, the abundance of microexon-containing circRNAs (ME-circRNAs) between tumor and normal tissues is correlated with the expression of a splicing-associated factor, Serine/arginine repetitive matrix 4 (SRRM4). Overexpressing SRRM4, known for regulating ME inclusion in mRNAs critical for neural differentiation, in human HEK293 cells resulted in the biogenesis of over 2000 novel ME-circRNAs, including ME-circEIF4G3, and changes in the abundance of many canonical circRNAs, including circSETDB2 and circLBRA. This shows that SRRM4, whose expression is correlated with poor prognosis in gliomas, acts as a bona fide circRNA biogenesis factor. Given the known roles of MEs and circRNAs in oncogenesis, the identification of these previously unrecognized ME-circRNAs further increases the complexity and functional purview of this non-coding RNA family.
|
50
|
Benchmarking of cell type deconvolution pipelines for transcriptomics data. Nat Commun 2020; 11:5650. [PMID: 33159064 PMCID: PMC7648640 DOI: 10.1038/s41467-020-19015-1] [Citation(s) in RCA: 244] [Impact Index Per Article: 48.8] [Received: 12/05/2019] [Accepted: 09/16/2020] [Indexed: 01/05/2023]
Abstract
Many computational methods have been developed to infer cell type proportions from bulk transcriptomics data. However, an evaluation of the impact of data transformation, pre-processing, marker selection, cell type composition and choice of methodology on the deconvolution results is still lacking. Using five single-cell RNA-sequencing (scRNA-seq) datasets, we generate pseudo-bulk mixtures to evaluate the combined impact of these factors. Both bulk deconvolution methodologies and those that use scRNA-seq data as reference perform best when applied to data in linear scale and the choice of normalization has a dramatic impact on some, but not all methods. Overall, methods that use scRNA-seq data have comparable performance to the best performing bulk methods whereas semi-supervised approaches show higher error values. Moreover, failure to include cell types in the reference that are present in a mixture leads to substantially worse results, regardless of the previous choices. Altogether, we evaluate the combined impact of factors affecting the deconvolution task across different datasets and propose general guidelines to maximize its performance. Inferring cell type proportions from transcriptomics data is affected by data transformation, normalization, choice of method and the markers used. Here, the authors use single-cell RNAseq datasets to evaluate the impact of these factors and propose guidelines to maximise deconvolution performance.
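The pseudo-bulk idea described in this abstract can be sketched in a few lines: mix known cell-type profiles at known proportions in linear scale, then check how well a deconvolution step recovers those proportions. The example below is a deliberately simplified sketch, not any of the benchmarked pipelines: the reference signature matrix is random synthetic data, the mixture is noiseless, and the estimator is ordinary least squares followed by clipping and renormalization, all of which are assumptions made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_genes, n_types = 50, 3

# Hypothetical reference: mean expression per gene (rows) and cell type
# (columns), kept in linear scale rather than log scale.
signature = rng.gamma(shape=2.0, scale=50.0, size=(n_genes, n_types))

# Known ground-truth proportions used to build the pseudo-bulk mixture.
true_props = np.array([0.6, 0.3, 0.1])
pseudo_bulk = signature @ true_props  # noiseless mixture for simplicity


def deconvolve(bulk, reference):
    """Estimate cell-type proportions by least squares.

    Negative coefficients are clipped to zero and the result is rescaled
    to sum to one. This is a simple stand-in for constrained methods.
    """
    coef, *_ = np.linalg.lstsq(reference, bulk, rcond=None)
    coef = np.clip(coef, 0.0, None)
    return coef / coef.sum()


estimated = deconvolve(pseudo_bulk, signature)
```

Because the ground-truth proportions are known by construction, the error of any estimator can be measured directly, which is what makes pseudo-bulk mixtures convenient for benchmarking.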
|