1
Jia M, Li Z, Pan M, Tao M, Lu X, Liu Y. Evaluation of immune infiltrating of thyroid cancer based on the intrinsic correlation between pair-wise immune genes. Life Sci 2020; 259:118248. [PMID: 32791153 DOI: 10.1016/j.lfs.2020.118248] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/07/2020] [Revised: 07/09/2020] [Accepted: 08/07/2020] [Indexed: 10/23/2022]
Abstract
INTRODUCTION: Unlike most mutation-driven cancers, thyroid cancer is thought to be highly dependent on changes in human hormone levels. Using changes in gene expression level as detection and diagnostic markers has become a research hotspot. The internal relationship between two genes and disease development is used to avoid the instability caused by single-gene fluctuation. AIM: Early diagnosis of thyroid cancer during tumorigenesis and recurrence may be achievable using immune gene pairs (IGPS). METHODS: We extracted thyroid cancer data from The Cancer Genome Atlas (TCGA) and used the CIBERSORT algorithm to estimate the infiltration of 22 immune cell types. We screened out IGPS that differed significantly between groups, then used a LinearSVC model to learn and screen features, combined with a deep learning neural network model to predict benign versus malignant status as well as patient group membership. KEY FINDINGS: Immune cell ratios differ significantly across tumor stages and in relapse samples. We screened out 42 and 64 IGPS for the normal-tumor and non-relapsed groups, respectively; pairs such as ASCC3-MAP3K7 and ATF2-SOCS5 show significant correlation in expression. We then used the IGPS to train the tumor diagnostic classifiers, obtaining an average AUC of 0.99 for both after ten rounds of cross-validation. SIGNIFICANCE: The IGPS give new insight into immune cell infiltration in thyroid cancer, and the deep learning model can be further used for early diagnosis of thyroid cancer and for estimating the risk of recurrence.
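The abstract does not say how a gene pair is turned into a feature. One common encoding, assumed here purely for illustration, is the within-sample ordering of the two genes: the feature is 1 when the first gene is expressed more highly than the second, else 0. The sketch below shows why such a feature is less sensitive to single-gene fluctuation or scale shifts. The function name `gene_pair_features` and the expression values are hypothetical, not taken from the study.

```python
from itertools import combinations


def gene_pair_features(sample, genes):
    """Return {(a, b): 1 or 0} for every unordered gene pair in one sample.

    The feature is 1 when gene a is expressed more highly than gene b in
    this sample, else 0. Only the within-sample ordering matters.
    """
    return {
        (a, b): 1 if sample[a] > sample[b] else 0
        for a, b in combinations(genes, 2)
    }


genes = ["ASCC3", "MAP3K7", "ATF2"]
# Hypothetical expression values, not data from the study.
sample = {"ASCC3": 5.2, "MAP3K7": 3.1, "ATF2": 7.4}
# The same sample after a uniform scale shift (e.g. a batch effect).
scaled = {g: 10 * v for g, v in sample.items()}

features = gene_pair_features(sample, genes)
scaled_features = gene_pair_features(scaled, genes)
```

Because only the ordering inside each sample is used, `features` and `scaled_features` are identical even though every raw value changed tenfold. A feature-selection step such as the LinearSVC model named in the abstract would then operate on these binary columns.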
Affiliation(s)
- Meng Jia
  - Thyroid Surgery, the First Affiliated Hospital of Zhengzhou University, Henan, 450052 Zhengzhou, China
- Zhuyao Li
  - Thyroid Surgery, the First Affiliated Hospital of Zhengzhou University, Henan, 450052 Zhengzhou, China
- Mengjiao Pan
  - Thyroid Surgery, the First Affiliated Hospital of Zhengzhou University, Henan, 450052 Zhengzhou, China
- Mei Tao
  - Thyroid Surgery, the First Affiliated Hospital of Zhengzhou University, Henan, 450052 Zhengzhou, China
- Xiubo Lu
  - Thyroid Surgery, the First Affiliated Hospital of Zhengzhou University, Henan, 450052 Zhengzhou, China
- Yang Liu
  - Department of Radiotherapy, Henan Cancer Hospital and the Affiliated Cancer Hospital of Zhengzhou University, Zhengzhou 450008, China
2
Eghbalnia HR, Wilfinger WW, Mackey K, Chomczynski P. Coordinated analysis of exon and intron data reveals novel differential gene expression changes. Sci Rep 2020; 10:15669. [PMID: 32973253 PMCID: PMC7515875 DOI: 10.1038/s41598-020-72482-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/03/2020] [Accepted: 08/24/2020] [Indexed: 12/14/2022] Open
Abstract
RNA-Seq expression analysis currently relies primarily upon exon expression data. The recognized role of introns during translation, and the presence of substantial RNA-Seq counts attributable to introns, provide the rationale for the simultaneous consideration of both exon and intron data. We describe here a method for the coordinated analysis of exon and intron data by investigating their relationship within individual genes and across samples, while taking into account changes in both variability and expression level. This coordinated analysis of exon and intron data offers strong evidence for significant differences that distinguish the profiles of the exon-only expression data from the combined exon and intron data. One advantage of our proposed method, called matched change characterization for exons and introns (MEI), is its straightforward applicability to existing archived data using small modifications to standard RNA-Seq pipelines. Using MEI, we demonstrate that when data are examined for changes in variability across control and case conditions, novel differential changes can be detected. Notably, when MEI criteria were employed in the analysis of an archived data set involving polyarthritic subjects, the number of differentially expressed genes was expanded by sevenfold. More importantly, the observed changes in exon and intron variability with statistically significant false discovery rates could be traced to specific immune pathway gene networks. The application of MEI analysis provides a strategy for incorporating the significance of exon and intron variability and further developing the role of using both exons and intron sequencing counts in studies of gene regulatory processes.
Affiliation(s)
- Hamid R Eghbalnia
  - University of Wisconsin-Madison, Madison, USA; University of Cincinnati, Cincinnati, USA
- Karol Mackey
  - Molecular Research Center, Inc., Cincinnati, USA
3
Yanamala N, Orandle MS, Kodali VK, Bishop L, Zeidler-Erdely PC, Roberts JR, Castranova V, Erdely A. Sparse Supervised Classification Methods Predict and Characterize Nanomaterial Exposures: Independent Markers of MWCNT Exposures. Toxicol Pathol 2017; 46:14-27. [PMID: 28934917 DOI: 10.1177/0192623317730575] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 12/25/2022]
Abstract
Recent experimental evidence indicates significant pulmonary toxicity of multiwalled carbon nanotubes (MWCNTs), such as inflammation, interstitial fibrosis, granuloma formation, and carcinogenicity. Although numerous studies have explored the adverse potential of various CNTs, their comparability is often limited. This is due to differences in administered dose, physicochemical characteristics, exposure methods, and end points monitored. Here, we addressed the problem through a sparse classification method, a supervised machine learning approach that can reduce the noise contained in redundant variables when discriminating between MWCNT-exposed and MWCNT-unexposed groups. A panel of proteins measured from bronchoalveolar lavage fluid (BAL) samples was used to predict exposure to various MWCNTs and to determine markers that are attributable to MWCNT exposure and toxicity in mice. Using a sparse support vector machine-based classification technique, we identified a small subset of proteins clearly distinguishing each exposure. Macrophage-derived chemokine (MDC/CCL22), in particular, was associated with various MWCNT exposures and was independent of the exposure method employed, that is, oropharyngeal aspiration versus inhalation exposure. Sustained expression of some of the selected protein markers also suggests their potential role in MWCNT-induced toxicity and proposes hypotheses for future mechanistic studies. Such approaches can be used more broadly for nanomaterial risk profiling studies to evaluate decisions related to dose/time-response relationships that could delineate experimental variables from exposure markers.
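The abstract does not give the exact sparse SVM formulation. A minimal sketch of one standard variant, a linear classifier trained on hinge loss with an L1 penalty, is shown below. It is an assumption for illustration, not the authors' implementation: the function names, the toy two-column "protein panel", the omitted bias term, and the hyperparameters are all hypothetical. The point it illustrates is how the L1 penalty drives the weight of an uninformative variable to exactly zero.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t; values within [-t, t] become exactly 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def train_sparse_svm(X, y, lam=0.1, lr=0.1, epochs=100):
    """Proximal subgradient descent on hinge loss + lam * ||w||_1 (no bias term)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        step = list(w)
        for xi, yi in zip(X, y):
            # Only samples inside the margin contribute a hinge subgradient.
            if yi * sum(wj * xj for wj, xj in zip(w, xi)) < 1:
                for j in range(d):
                    step[j] += lr * yi * xi[j] / n
        # The proximal step for the L1 penalty is soft-thresholding.
        w = [soft_threshold(sj, lr * lam) for sj in step]
    return w


# Hypothetical data: column 0 tracks exposure (+1) vs. control (-1),
# column 1 is a variable with almost no relation to the label.
X = [[2.0, 1.0], [2.0, -1.0], [2.0, 1.0],
     [-2.0, 1.0], [-2.0, -0.3], [-2.0, 0.1]]
y = [1, 1, 1, -1, -1, -1]

w = train_sparse_svm(X, y)
```

On this toy data the weight on column 1 is thresholded to exactly zero while column 0 keeps a positive weight, which is the sense in which a sparse classifier selects a small subset of markers.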
Affiliation(s)
- Naveena Yanamala
  - Exposure Assessment Branch, Health Effects Laboratory Division, National Institute for Occupational Safety and Health, Morgantown, West Virginia, USA
- Marlene S Orandle
  - Pathology & Physiology Research Branch, Health Effects Laboratory Division, National Institute for Occupational Safety and Health, Morgantown, West Virginia, USA
- Vamsi K Kodali
  - Pathology & Physiology Research Branch, Health Effects Laboratory Division, National Institute for Occupational Safety and Health, Morgantown, West Virginia, USA
- Lindsey Bishop
  - Pathology & Physiology Research Branch, Health Effects Laboratory Division, National Institute for Occupational Safety and Health, Morgantown, West Virginia, USA
- Patti C Zeidler-Erdely
  - Pathology & Physiology Research Branch, Health Effects Laboratory Division, National Institute for Occupational Safety and Health, Morgantown, West Virginia, USA
- Jenny R Roberts
  - Allergy and Clinical Immunology Branch, Health Effects Laboratory Division, National Institute for Occupational Safety and Health, Morgantown, West Virginia, USA
- Vincent Castranova
  - Department of Pharmaceutical Sciences, West Virginia University, Morgantown, West Virginia, USA
- Aaron Erdely
  - Pathology & Physiology Research Branch, Health Effects Laboratory Division, National Institute for Occupational Safety and Health, Morgantown, West Virginia, USA
4
Jin J, Zhou S, Xu Q, An J. Identification of risk factors in epidemiologic study based on ROC curve and network. Sci Rep 2017; 7:46655. [PMID: 28436477 DOI: 10.1038/srep46655] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/26/2016] [Accepted: 03/28/2017] [Indexed: 11/13/2022] Open
Abstract
This article proposes a new non-parametric approach for identifying risk factors and their correlations in epidemiologic studies, in which investigation data may have high variation because of individual differences or correlated risk factors. First, based on classification information of high or low disease incidence, we estimate the Receiver Operating Characteristic (ROC) curve of each risk factor. Then, through the difference between the ROC curve of each factor and the diagonal, we evaluate and screen for the important risk factors. In addition, based on the difference between the ROC curves corresponding to any pair of factors, we define a new type of correlation matrix to measure their correlations with disease, and then use this matrix as an adjacency matrix to construct a network as a visualization tool for exploring the structure among factors, which can be used to direct further studies. Finally, these methods are applied to an analysis of water pollutants and gastrointestinal tumors, and to an analysis of gene expression data in tumor and normal colon tissue samples.
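The screening step described above can be sketched in a simplified form. The abstract does not define the exact distance between a factor's ROC curve and the diagonal, so this sketch assumes the simplest summary, |AUC - 0.5|, where the AUC is computed non-parametrically as the fraction of (high-incidence, low-incidence) pairs ranked correctly. The function names, factor names, and values are hypothetical.

```python
def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


def screen_factors(factors, labels):
    """Rank factors by |AUC - 0.5|, largest first."""
    scored = [(name, auc(values, labels)) for name, values in factors.items()]
    return sorted(scored, key=lambda item: abs(item[1] - 0.5), reverse=True)


# 1 = high disease incidence, 0 = low disease incidence (hypothetical data).
labels = [1, 1, 1, 0, 0, 0]
factors = {
    "pollutant_a": [0.9, 0.8, 0.7, 0.3, 0.2, 0.1],  # higher where incidence is high
    "pollutant_b": [0.5, 0.1, 0.9, 0.4, 0.8, 0.2],  # close to uninformative
    "pollutant_c": [0.1, 0.2, 0.3, 0.7, 0.8, 0.9],  # higher where incidence is low
}

ranked = screen_factors(factors, labels)
```

Taking the absolute difference keeps factors whose ROC curve lies below the diagonal (here `pollutant_c`), since a strongly inverse relationship is as informative as a direct one. `pollutant_b` ends up last in `ranked` because its AUC of 5/9 is close to the diagonal's 0.5.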
5
Hu Y, Hase T, Li HP, Prabhakar S, Kitano H, Ng SK, Ghosh S, Wee LJK. A machine learning approach for the identification of key markers involved in brain development from single-cell transcriptomic data. BMC Genomics 2016; 17:1025. [PMID: 28155657 PMCID: PMC5260093 DOI: 10.1186/s12864-016-3317-7] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: The ability to sequence the transcriptomes of single cells using single-cell RNA-seq technologies presents a shift in the scientific paradigm: scientists are now able to investigate the complex biology of a heterogeneous population of cells, one cell at a time. However, to date, there has not been a suitable computational methodology for the analysis of such an intricate deluge of data, in particular techniques that aid the identification of the unique transcriptomic profile differences between the different cellular subtypes. In this paper, we describe a novel methodology for the analysis of single-cell RNA-seq data, obtained from neocortical cells and neural progenitor cells, using machine learning algorithms (Support Vector Machine (SVM) and Random Forest (RF)). RESULTS: Thirty-eight key transcripts were identified, using the SVM-based recursive feature elimination (SVM-RFE) method of feature selection, to best differentiate developing neocortical cells from neural progenitor cells in the SVM and RF classifiers built. These genes also possessed a higher discriminative power (enhanced prediction accuracy) compared with commonly used statistical techniques or geneset-based approaches. Further downstream network reconstruction analysis was carried out to unravel hidden general regulatory networks, in which novel interactions could be further validated in wet-lab experimentation and could be useful candidates to target for the treatment of neuronal developmental diseases. CONCLUSION: The novel approach reported is able to identify transcripts, with reported neuronal involvement, which optimally differentiate neocortical cells and neural progenitor cells. It is believed to be extensible and applicable to other single-cell RNA-seq expression profiles, such as those from studies of cancer progression and treatment within a highly heterogeneous tumour.
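The SVM-RFE procedure named above follows a simple loop: train a linear SVM, rank the features by squared weight, drop the lowest-ranked one, and repeat. A minimal sketch is given below under stated assumptions: the linear SVM is approximated by batch subgradient descent on the L2-regularised hinge loss with no bias term, one feature is removed per round, and the three-column toy matrix is hypothetical. It is not the study's pipeline, which would typically rely on an established SVM library.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit w (no bias term) by batch subgradient descent on L2-regularised hinge loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1:  # sample violates the margin
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def svm_rfe(X, y, n_keep=1):
    """Repeatedly retrain and drop the feature with the smallest squared weight."""
    active = list(range(len(X[0])))
    eliminated = []
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w = train_linear_svm(X_sub, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        eliminated.append(active.pop(worst))
    return active, eliminated


# Hypothetical expression matrix: column 0 separates the two cell types
# strongly, column 1 weakly, column 2 not at all.
X = [[2.0, 0.1, 0.0], [2.0, 0.2, 0.0], [2.0, 0.1, 0.0],
     [-2.0, -0.2, 0.0], [-2.0, -0.1, 0.0], [-2.0, -0.2, 0.0]]
y = [1, 1, 1, -1, -1, -1]

kept, dropped = svm_rfe(X, y, n_keep=1)
```

Here the constant column is eliminated first, the weak column second, and the strongly discriminating column is kept. Retraining after every removal is what distinguishes recursive elimination from a single ranking pass, because the weights of the remaining features can change once a correlated feature is gone.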
Affiliation(s)
- Yongli Hu
  - Institute for Infocomm Research, A*STAR, 1 Fusionopolis Way, #21-01 Connexis (South Tower), Singapore, Singapore
  - The Systems Biology Institute, Singapore Node hosted at the Institute for Infocomm Research, A*STAR, Singapore, Singapore
- Takeshi Hase
  - The Systems Biology Institute, Falcon Building 5 F, 5-6-9 Shirokanedai, Minato, Tokyo, 108-0071 Japan
- Hui Peng Li
  - Computational and Systems Biology, Genome Institute of Singapore, A*STAR, 60 Biopolis Street, Genome, #02-01, Singapore, 138672 Singapore
- Shyam Prabhakar
  - Computational and Systems Biology, Genome Institute of Singapore, A*STAR, 60 Biopolis Street, Genome, #02-01, Singapore, 138672 Singapore
- Hiroaki Kitano
  - The Systems Biology Institute, Falcon Building 5 F, 5-6-9 Shirokanedai, Minato, Tokyo, 108-0071 Japan
- See Kiong Ng
  - Institute for Infocomm Research, A*STAR, 1 Fusionopolis Way, #21-01 Connexis (South Tower), Singapore, Singapore
- Samik Ghosh
  - The Systems Biology Institute, Falcon Building 5 F, 5-6-9 Shirokanedai, Minato, Tokyo, 108-0071 Japan
- Lawrence Jin Kiat Wee
  - Institute for Infocomm Research, A*STAR, 1 Fusionopolis Way, #21-01 Connexis (South Tower), Singapore, Singapore
  - The Systems Biology Institute, Singapore Node hosted at the Institute for Infocomm Research, A*STAR, Singapore, Singapore
6
Khaledi A, Schniederjans M, Pohl S, Rainer R, Bodenhofer U, Xia B, Klawonn F, Bruchmann S, Preusse M, Eckweiler D, Dötsch A, Häussler S. Transcriptome Profiling of Antimicrobial Resistance in Pseudomonas aeruginosa. Antimicrob Agents Chemother 2016; 60:4722-33. [PMID: 27216077 DOI: 10.1128/AAC.00075-16] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 01/11/2016] [Accepted: 05/19/2016] [Indexed: 11/20/2022] Open
Abstract
Emerging resistance to antimicrobials and the lack of new antibiotic drug candidates underscore the need for optimization of current diagnostics and therapies to diminish the evolution and spread of multidrug resistance. As the antibiotic resistance status of a bacterial pathogen is defined by its genome, resistance profiling by applying next-generation sequencing (NGS) technologies may in the future accomplish pathogen identification, prompt initiation of targeted individualized treatment, and the implementation of optimized infection control measures. In this study, qualitative RNA sequencing was used to identify key genetic determinants of antibiotic resistance in 135 clinical Pseudomonas aeruginosa isolates from diverse geographic and infection site origins. By applying transcriptome-wide association studies, adaptive variations associated with resistance to the antibiotic classes fluoroquinolones, aminoglycosides, and β-lactams were identified. Besides potential novel biomarkers with a direct correlation to resistance, global patterns of phenotype-associated gene expression and sequence variations were identified by predictive machine learning approaches. Our research serves to establish genotype-based molecular diagnostic tools for the identification of the current resistance profiles of bacterial pathogens and paves the way for faster diagnostics for more efficient, targeted treatment strategies to also mitigate the future potential for resistance evolution.