1
Fawaz A, Ferraresi A, Isidoro C. Systems Biology in Cancer Diagnosis Integrating Omics Technologies and Artificial Intelligence to Support Physician Decision Making. J Pers Med 2023; 13:1590. [PMID: 38003905 PMCID: PMC10672164 DOI: 10.3390/jpm13111590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 11/07/2023] [Accepted: 11/08/2023] [Indexed: 11/26/2023] Open
Abstract
Cancer is the second major cause of disease-related death worldwide, and its accurate early diagnosis and therapeutic intervention are fundamental for saving the patient's life. Cancer, as a complex and heterogeneous disorder, results from the disruption and alteration of a wide variety of biological entities, including genes, proteins, mRNAs, miRNAs, and metabolites, that eventually emerge as clinical symptoms. Traditionally, diagnosis is based on clinical examination, blood tests for biomarkers, the histopathology of a biopsy, and imaging (MRI, CT, PET, and US). Additionally, omics biotechnologies help to further characterize the genome, metabolome, and microbiome traits of the patient that could have an impact on the prognosis and the patient's response to therapy. The integration of all these data relies on the gathering of several experts and may require considerable time, and, unfortunately, it is not without the risk of error in the interpretation and therefore in the decision. Systems biology algorithms exploit Artificial Intelligence (AI) combined with omics technologies to perform a rapid and accurate analysis and integration of the patient's big data, and support the physician in making a diagnosis and tailoring the most appropriate therapeutic intervention. However, AI is not free from possible diagnostic and prognostic errors in the interpretation of images or biochemical-clinical data. Here, we first describe the methods used by systems biology for combining AI with omics and then discuss the potential, challenges, limitations, and critical issues in using AI in cancer research.
Affiliation(s)
- Ciro Isidoro
- Laboratory of Molecular Pathology, Department of Health Sciences, Università del Piemonte Orientale, 28100 Novara, Italy
2
Tapak L, Ghasemi MK, Afshar S, Mahjub H, Soltanian A, Khotanlou H. Identification of gene profiles related to the development of oral cancer using a deep learning technique. BMC Med Genomics 2023; 16:35. [PMID: 36849997 PMCID: PMC9972685 DOI: 10.1186/s12920-023-01462-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 02/15/2023] [Indexed: 03/01/2023] Open
Abstract
BACKGROUND Oral cancer (OC) is a debilitating disease that can adversely affect patients' quality of life. Patients with oral premalignant lesions have a high risk of developing OC. Therefore, identifying robust survival subgroups among them may significantly improve patient therapy and care. This study aimed to identify prognostic biomarkers that predict the time-to-development of OC and survival stratification for patients using state-of-the-art machine learning and deep learning. METHODS Gene expression profiles (29,096 probes) related to 86 patients from the GSE26549 dataset from the GEO repository were used. An autoencoder deep learning neural network model was used to extract features. We also used a univariate Cox regression model to select significant features obtained from the deep learning method (P < 0.05). High-risk and low-risk groups were then identified using a hierarchical clustering technique applied to those of the 100 encoded features (the number of units of the encoding layer, i.e., the bottleneck of the network) from the autoencoder that were selected by the Cox proportional hazards model. A supervised random forest (RF) classifier was then used to identify gene profiles related to subtypes of OC from the original 29,096 probes. RESULTS Among the 100 encoded features extracted by the autoencoder, seventy were significantly related to time-to-OC-development based on the univariate Cox model, and these were used as the inputs for the clustering of patients. Two survival risk groups were identified (P value of log-rank test = 0.003) and were used as the labels for supervised classification. The overall accuracy of the RF classifier was 0.916 on the test set, and the classifier yielded 21 top genes (FUT8, DDR2, ATM, CD247, ETS1, ZEB2, COL5A2, GMAP7, CDH1, COL11A2, COL3A1, AHR, COL2A1, CHORDC1, PTP4A3, COL1A2, CCR2, PDGFRB, COL1A1, FERMT2, PIK3CB) associated with time to developing OC, selected among the original 29,096 probes.
CONCLUSIONS Using deep learning, our study identified prominent transcriptional biomarkers for determining patients at high risk of developing oral cancer, which may be prognostic as significant targets for OC therapy. The identified genes may serve as potential targets for oral cancer chemoprevention. Additional validation of these biomarkers in experimental prospective and retrospective studies will launch them in OC clinics.
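The abstract above reports a log-rank test between the two risk groups but does not say how it was computed. As a minimal sketch of a two-group log-rank test, the following uses only the Python standard library. The function name `logrank_test`, its argument layout, and the toy follow-up times are assumptions for illustration and are not taken from the study.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*  : follow-up times for each patient in the group
    events_* : 1 if the event (e.g. OC development) was observed, 0 if censored
    Returns (chi_square, p_value) with 1 degree of freedom.
    """
    # Pool both groups as (time, event, group) records.
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0  # observed minus expected events in group A
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for tt, _, g in data if g == 0 and tt >= t)  # at risk in A
        n_b = sum(1 for tt, _, g in data if g == 1 and tt >= t)  # at risk in B
        d_a = sum(1 for tt, e, g in data if g == 0 and tt == t and e)
        d_b = sum(1 for tt, e, g in data if g == 1 and tt == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the number of events in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi_square = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Toy example (synthetic times, all events observed).
high_risk = [1, 2, 3, 4, 5]
low_risk = [10, 11, 12, 13, 14]
chi_square, p_value = logrank_test(high_risk, [1] * 5, low_risk, [1] * 5)
```

In a workflow like the one described, the group labels would come from the clustering step, and a small p-value would indicate that the two clusters differ in time to event.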
Affiliation(s)
- Leili Tapak
- Department of Biostatistics, School of Public Health and Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran
- Mohammad Kazem Ghasemi
- Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran
- Saeid Afshar
- Research Center for Molecular Medicine, Hamadan University of Medical Sciences, Hamadan, Iran
- Hossein Mahjub
- Department of Biostatistics, School of Public Health and Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran
- Alireza Soltanian
- Department of Biostatistics, School of Public Health and Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran
- Hassan Khotanlou
- Department of Computer Engineering, Bu-Ali Sina University, Hamadan, Iran
3
He X, Liu X, Zuo F, Shi H, Jing J. Artificial intelligence-based multi-omics analysis fuels cancer precision medicine. Semin Cancer Biol 2023; 88:187-200. [PMID: 36596352 DOI: 10.1016/j.semcancer.2022.12.009] [Citation(s) in RCA: 102] [Impact Index Per Article: 51.0] [Received: 11/17/2022] [Revised: 12/16/2022] [Accepted: 12/29/2022] [Indexed: 01/02/2023]
Abstract
With biotechnological advancements, innovative omics technologies are constantly emerging that have enabled researchers to access multi-layer information from the genome, epigenome, transcriptome, proteome, metabolome, and more. A wealth of omics technologies, including bulk and single-cell omics approaches, have empowered researchers to characterize different molecular layers at unprecedented scale and resolution, providing a holistic view of tumor behavior. Multi-omics analysis allows systematic interrogation of various molecular information at each biological layer while posing tricky challenges regarding how to extract valuable insights from the exponentially increasing amount of multi-omics data. Therefore, efficient algorithms are needed to reduce the dimensionality of the data while simultaneously dissecting the mysteries behind the complex biological processes of cancer. Artificial intelligence has demonstrated the ability to analyze complementary multi-modal data streams within the oncology realm. The coincident development of multi-omics technologies and artificial intelligence algorithms has fuelled the development of cancer precision medicine. Here, we present state-of-the-art omics technologies and outline a roadmap of multi-omics integration analysis using an artificial intelligence strategy. The advances made using artificial intelligence-based multi-omics approaches are described, especially concerning early cancer screening, diagnosis, response assessment, and prognosis prediction. Finally, we discuss the challenges faced in multi-omics analysis, along with tentative future trends in this field. With the increasing application of artificial intelligence in multi-omics analysis, we anticipate a shifting paradigm in precision medicine becoming driven by artificial intelligence-based multi-omics technologies.
Affiliation(s)
- Xiujing He
- Laboratory of Integrative Medicine, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan, PR China
- Xiaowei Liu
- Laboratory of Integrative Medicine, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan, PR China
- Fengli Zuo
- Laboratory of Integrative Medicine, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan, PR China
- Hubing Shi
- Laboratory of Integrative Medicine, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan, PR China
- Jing Jing
- Laboratory of Integrative Medicine, Clinical Research Center for Breast, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan, PR China
4
Valyaeva AA, Tikhomirova MA, Potashnikova DM, Bogomazova AN, Snigiryova GP, Penin AA, Logacheva MD, Arifulin EA, Shmakova AA, Germini D, Kachalova AI, Saidova AA, Zharikova AA, Musinova YR, Mironov AA, Vassetzky YS, Sheval EV. Ectopic expression of HIV-1 Tat modifies gene expression in cultured B cells: implications for the development of B-cell lymphomas in HIV-1-infected patients. PeerJ 2022; 10:e13986. [PMID: 36275462 PMCID: PMC9586123 DOI: 10.7717/peerj.13986] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/12/2022] [Accepted: 08/11/2022] [Indexed: 01/19/2023] Open
Abstract
An increased frequency of B-cell lymphomas is observed in human immunodeficiency virus-1 (HIV-1)-infected patients, although HIV-1 does not infect B cells. Development of B-cell lymphomas may be potentially due to the action of the HIV-1 Tat protein, which is actively released from HIV-1-infected cells, on uninfected B cells. The exact mechanism of Tat-induced B-cell lymphomagenesis has not yet been precisely identified. Here, we ectopically expressed either Tat or its TatC22G mutant devoid of transactivation activity in the RPMI 8866 lymphoblastoid B cell line and performed a genome-wide analysis of host gene expression. Stable expression of both Tat and TatC22G led to substantial modifications of the host transcriptome, including pronounced changes in antiviral response and cell cycle pathways. We did not find any strong action of Tat on cell proliferation, but during prolonged culturing, Tat-expressing cells were displaced by non-expressing cells, indicating that Tat expression slightly inhibited cell growth. We also found an increased frequency of chromosome aberrations in cells expressing Tat. Thus, Tat can modify gene expression in cultured B cells, leading to subtle modifications in cellular growth and chromosome instability, which could promote lymphomagenesis over time.
Affiliation(s)
- Anna A. Valyaeva
- School of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia; Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia; Department of Cell Biology and Histology, School of Biology, Lomonosov Moscow State University, Moscow, Russia
- Maria A. Tikhomirova
- School of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia; Koltzov Institute of Developmental Biology, Moscow, Russia
- Daria M. Potashnikova
- Department of Cell Biology and Histology, School of Biology, Lomonosov Moscow State University, Moscow, Russia
- Alexandra N. Bogomazova
- Federal Research and Clinical Center of Physical-Chemical Medicine, Moscow, Russia; Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, Moscow, Russia
- Maria D. Logacheva
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia; Skolkovo Institute of Science and Technology, Moscow, Russia
- Eugene A. Arifulin
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia
- Anna A. Shmakova
- Koltzov Institute of Developmental Biology, Moscow, Russia; UMR9018 (CNRS – Institut Gustave Roussy – Université Paris Saclay), Centre National de Recherche Scientifique, Villejuif, France
- Diego Germini
- UMR9018 (CNRS – Institut Gustave Roussy – Université Paris Saclay), Centre National de Recherche Scientifique, Villejuif, France
- Anastasia I. Kachalova
- Department of Cell Biology and Histology, School of Biology, Lomonosov Moscow State University, Moscow, Russia
- Aleena A. Saidova
- Department of Cell Biology and Histology, School of Biology, Lomonosov Moscow State University, Moscow, Russia; Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Engelhardt Institute of Molecular Biology, Moscow, Russia
- Anastasia A. Zharikova
- School of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia; Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia
- Yana R. Musinova
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia; Koltzov Institute of Developmental Biology, Moscow, Russia
- Andrey A. Mironov
- School of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia; Institute for Information Transmission Problems, Moscow, Russia
- Yegor S. Vassetzky
- Koltzov Institute of Developmental Biology, Moscow, Russia; UMR9018 (CNRS – Institut Gustave Roussy – Université Paris Saclay), Centre National de Recherche Scientifique, Villejuif, France
- Eugene V. Sheval
- School of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia; Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia; Department of Cell Biology and Histology, School of Biology, Lomonosov Moscow State University, Moscow, Russia
5
Aghamiri SS, Amin R, Helikar T. Recent applications of quantitative systems pharmacology and machine learning models across diseases. J Pharmacokinet Pharmacodyn 2021; 49:19-37. [PMID: 34671863 PMCID: PMC8528185 DOI: 10.1007/s10928-021-09790-9] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 06/25/2021] [Accepted: 10/07/2021] [Indexed: 12/29/2022]
Abstract
Quantitative systems pharmacology (QSP) is a quantitative and mechanistic platform describing the phenotypic interaction between drugs, biological networks, and disease conditions to predict optimal therapeutic response. In this meta-analysis study, we review the utility of the QSP platform in drug development and therapeutic strategies based on recent publications (2019-2021). We gathered recent original QSP models and described the diversity of their applications based on therapeutic areas, methodologies, software platforms, and functionalities. The collection and investigation of these publications can assist in providing a repository of recent QSP studies to facilitate the discovery and further reusability of QSP models. Our review shows that the largest number of QSP efforts in recent years is in Immuno-Oncology. We also addressed the benefits of integrative approaches in this field by presenting the applications of Machine Learning methods for drug discovery and QSP models. Based on this meta-analysis, we discuss the advantages and limitations of QSP models and propose fields where the QSP approach constitutes a valuable interface for more investigations to tackle complex diseases and improve drug development.
Affiliation(s)
- Sara Sadat Aghamiri
- Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, NE, USA
- Rada Amin
- Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, NE, USA
- Tomáš Helikar
- Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, NE, USA
6
Nies HW, Mohamad MS, Zakaria Z, Chan WH, Remli MA, Nies YH. Enhanced Directed Random Walk for the Identification of Breast Cancer Prognostic Markers from Multiclass Expression Data. Entropy (Basel) 2021; 23:1232. [PMID: 34573857 PMCID: PMC8472068 DOI: 10.3390/e23091232] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/15/2021] [Revised: 09/14/2021] [Accepted: 09/16/2021] [Indexed: 12/12/2022]
Abstract
Artificial intelligence in healthcare can potentially identify the probability of contracting a particular disease more accurately. There are five common molecular subtypes of breast cancer: luminal A, luminal B, basal, ERBB2, and normal-like. Previous investigations showed that pathway-based microarray analysis could help in the identification of prognostic markers from gene expressions. For example, directed random walk (DRW) can infer a greater reproducibility power of the pathway activity between two classes of samples with a higher classification accuracy. However, most of the existing methods (including DRW) ignored the characteristics of different cancer subtypes and considered all of the pathways to contribute equally to the analysis. Therefore, an enhanced DRW (eDRW+) is proposed to identify breast cancer prognostic markers from multiclass expression data. An improved weight strategy using one-way ANOVA (F-test) and pathway selection based on the greatest reproducibility power is proposed in eDRW+. The experimental results show that the eDRW+ exceeds other methods in terms of AUC. Besides this, the eDRW+ identifies 294 gene markers and 45 pathway markers from the breast cancer datasets with better AUC. Therefore, the prognostic markers (pathway markers and gene markers) can identify drug targets and look for cancer subtypes with clinically distinct outcomes.
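To illustrate the one-way ANOVA (F-test) weighting mentioned in the abstract above, the following minimal sketch computes a per-gene F statistic across subtype labels and rescales it to [0, 1]. The function names (`anova_f`, `gene_weights`), the toy expression values, and the rescaling step are assumptions made for illustration. The abstract does not specify how eDRW+ converts the F statistic into a weight.

```python
import math
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups (each a list of floats)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more samples than groups")
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        return math.inf if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def gene_weights(expression, labels):
    """Map each gene to a weight in [0, 1] based on its F statistic.

    expression : dict of gene name -> list of expression values (one per sample)
    labels     : list of subtype labels, aligned with the expression values
    The division by the largest finite F value is a hypothetical normalisation.
    """
    classes = sorted(set(labels))
    scores = {}
    for gene, values in expression.items():
        groups = [[v for v, lab in zip(values, labels) if lab == c] for c in classes]
        scores[gene] = anova_f(groups)
    finite_max = max((f for f in scores.values() if math.isfinite(f)), default=0.0)

    def scale(f):
        if math.isinf(f):
            return 1.0
        return f / finite_max if finite_max > 0 else 0.0

    return {gene: scale(f) for gene, f in scores.items()}


# Toy example with three subtypes and two genes (synthetic values).
labels = ["A", "A", "B", "B", "C", "C"]
expression = {
    "g1": [1.0, 1.2, 3.0, 3.1, 5.0, 5.2],  # differs strongly between subtypes
    "g2": [2.0, 2.5, 2.1, 2.4, 2.2, 2.6],  # nearly constant across subtypes
}
weights = gene_weights(expression, labels)
```

Genes whose expression separates the subtypes receive weights near 1, while genes with similar expression in every subtype receive weights near 0.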
Affiliation(s)
- Hui Wen Nies
- School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia, Skudai 81310, Malaysia
- Mohd Saberi Mohamad
- Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Al Ain 17666, United Arab Emirates
- Zalmiyah Zakaria
- School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia, Skudai 81310, Malaysia
- Weng Howe Chan
- School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia, Skudai 81310, Malaysia
- Muhammad Akmal Remli
- Institute for Artificial Intelligence and Big Data, Universiti Malaysia Kelantan, Kota Bharu 16100, Malaysia
- Yong Hui Nies
- Department of Anatomy, Faculty of Medicine, Universiti Kebangsaan Malaysia Medical Centre, Cheras, Kuala Lumpur 56000, Malaysia
7
Onoguchi-Mizutani R, Kishi Y, Ogura Y, Nishimura Y, Imamachi N, Suzuki Y, Miyazaki S, Akimitsu N. Identification of novel heat shock-induced long non-coding RNA in human cells. J Biochem 2021; 169:497-505. [PMID: 33170212 DOI: 10.1093/jb/mvaa126] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/21/2020] [Accepted: 10/27/2020] [Indexed: 12/12/2022] Open
Abstract
The heat-shock response is a crucial system for survival of organisms under heat stress. During heat-shock stress, gene expression is globally suppressed, but expression of some genes, such as chaperone genes, is selectively promoted. These selectively activated genes have critical roles in the heat-shock response, so it is necessary to discover heat-inducible genes to reveal the overall heat-shock response picture. The expression profiling of heat-inducible protein-coding genes has been well-studied, but that of non-coding genes remains unclear in mammalian systems. Here, we used RNA-seq analysis of heat shock-treated A549 cells to identify seven novel long non-coding RNAs that responded to heat shock. We focussed on CTD-2377D24.6 RNA, which is most significantly induced by heat shock, and found that the promoter region of CTD-2377D24.6 contains the binding site for transcription factor HSF1 (heat shock factor 1), which plays a central role in the heat-shock response. We confirmed that HSF1 knockdown cancelled the induction of CTD-2377D24.6 RNA upon heat shock. These results suggest that CTD-2377D24.6 RNA is a novel heat shock-inducible transcript that is transcribed by HSF1.
Affiliation(s)
- Rena Onoguchi-Mizutani
- Isotope Science Center, The University of Tokyo, 2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
- Yoshihiro Kishi
- Isotope Science Center, The University of Tokyo, 2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
- Yoko Ogura
- Isotope Science Center, The University of Tokyo, 2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
- Yuuki Nishimura
- Isotope Science Center, The University of Tokyo, 2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan; Department of Medical and Life Science, Faculty of Pharmaceutical Science, Tokyo University of Science, 2669 Yamazaki, Noda-shi, Chiba 278-8510, Japan
- Naoto Imamachi
- Isotope Science Center, The University of Tokyo, 2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
- Yutaka Suzuki
- Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba 277-8562, Japan
- Satoru Miyazaki
- Department of Medical and Life Science, Faculty of Pharmaceutical Science, Tokyo University of Science, 2669 Yamazaki, Noda-shi, Chiba 278-8510, Japan
- Nobuyoshi Akimitsu
- Isotope Science Center, The University of Tokyo, 2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan
8
Biswas N, Chakrabarti S. Artificial Intelligence (AI)-Based Systems Biology Approaches in Multi-Omics Data Analysis of Cancer. Front Oncol 2020; 10:588221. [PMID: 33154949 PMCID: PMC7591760 DOI: 10.3389/fonc.2020.588221] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 07/28/2020] [Accepted: 09/21/2020] [Indexed: 12/13/2022] Open
Abstract
Cancer is the manifestation of abnormalities of different physiological processes involving genes, DNAs, RNAs, proteins, and other biomolecules whose profiles are reflected in different omics data types. As these bio-entities are highly correlated, integrative analysis of different types of omics data (multi-omics data) is required to understand the disease from tumorigenesis to disease progression. Artificial intelligence (AI), specifically machine learning algorithms, has the ability to make decisive interpretations of "big"-sized complex data and, hence, appears as the most effective tool for the analysis and understanding of multi-omics data for patient-specific observations. In this review, we have discussed the recent outcomes of employing AI in multi-omics data analysis of different types of cancer. Based on the research trends and significance in patient treatment, we have primarily focused on AI-based analysis for determining cancer subtypes, disease prognosis, and therapeutic targets. We have also discussed AI analysis of some non-canonical types of omics data, as they are capable of playing a determining role in cancer patient care. Additionally, we have briefly discussed the data repositories because of their pivotal role in multi-omics data storing, processing, and analysis.
Affiliation(s)
- Nupur Biswas
- Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, IICB TRUE Campus, Kolkata, India
- Saikat Chakrabarti
- Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, IICB TRUE Campus, Kolkata, India
9
Liu L, Wang G, Wang L, Yu C, Li M, Song S, Hao L, Ma L, Zhang Z. Computational identification and characterization of glioma candidate biomarkers through multi-omics integrative profiling. Biol Direct 2020; 15:10. [PMID: 32539851 PMCID: PMC7294636 DOI: 10.1186/s13062-020-00264-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 02/14/2020] [Accepted: 06/04/2020] [Indexed: 01/13/2023] Open
Abstract
BACKGROUND Glioma is one of the most common malignant brain tumors and exhibits a low resection rate and a high recurrence risk. Although a large number of glioma studies powered by high-throughput sequencing technologies have led to massive multi-omics datasets, a comprehensive integration of glioma datasets for uncovering candidate biomarker genes is lacking. RESULTS In this study, we collected a large-scale assembly of multi-omics multi-cohort datasets from worldwide public resources, involving a total of 16,939 samples across 19 independent studies. Through comprehensive molecular profiling across different datasets, we revealed that PRKCG (Protein Kinase C Gamma), a brain-specific gene detectable in cerebrospinal fluid, is closely associated with glioma. Specifically, it presents lower expression and higher methylation in glioma samples compared with normal samples. PRKCG expression/methylation change from high to low is indicative of glioma progression from low-grade to high-grade, and high RNA expression is suggestive of good survival. Importantly, PRKCG in combination with MGMT is effective in predicting survival outcomes in a more precise manner. CONCLUSIONS PRKCG bears great potential for glioma diagnosis, prognosis and therapy, and PRKCG-like genes may represent a set of important genes associated with different molecular mechanisms in glioma tumorigenesis. Our study indicates the importance of computational integrative multi-omics data analysis and represents a data-driven scheme toward precision tumor subtyping and accurate personalized healthcare.
Affiliation(s)
- Lin Liu
- China National Center for Bioinformation, Beijing, 100101, China
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100101, China
- Guangyu Wang
- China National Center for Bioinformation, Beijing, 100101, China
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100101, China
- Present Address: The Methodist Hospital Research Institute, 6670 Bertner Ave, Houston, TX, 77030, USA
- Liguo Wang
- Division of Biomedical Statistics and Informatics, Mayo Clinic College of Medicine, Rochester, MN, 55905, USA
- Chunlei Yu
- China National Center for Bioinformation, Beijing, 100101, China
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100101, China
- Mengwei Li
- China National Center for Bioinformation, Beijing, 100101, China
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100101, China
- Shuhui Song
- China National Center for Bioinformation, Beijing, 100101, China
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100101, China
- Lili Hao
- China National Center for Bioinformation, Beijing, 100101, China
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100101, China
- Lina Ma
- China National Center for Bioinformation, Beijing, 100101, China
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100101, China
- Zhang Zhang
- China National Center for Bioinformation, Beijing, 100101, China
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100101, China
10
Naorem LD, Prakash VS, Muthaiyan M, Venkatesan A. Comprehensive analysis of dysregulated lncRNAs and their competing endogenous RNA network in triple-negative breast cancer. Int J Biol Macromol 2020; 145:429-436. [DOI: 10.1016/j.ijbiomac.2019.12.196] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 11/03/2019] [Revised: 12/21/2019] [Accepted: 12/21/2019] [Indexed: 01/24/2023]
11
Mohammed A, Cui Y, Mas VR, Kamaleswaran R. Differential gene expression analysis reveals novel genes and pathways in pediatric septic shock patients. Sci Rep 2019; 9:11270. [PMID: 31375728 PMCID: PMC6677896 DOI: 10.1038/s41598-019-47703-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 03/04/2019] [Accepted: 07/12/2019] [Indexed: 12/20/2022] Open
Abstract
Septic shock is a devastating health condition caused by uncontrolled sepsis. Advancements in high-throughput sequencing techniques have increased the number of potential genetic biomarkers under review. Multiple genetic markers and functional pathways play a part in the development and progression of pediatric septic shock. We identified 53 differentially expressed pediatric septic shock biomarkers using gene expression data sampled from 181 patients admitted to the pediatric intensive care unit within the first 24 hours of their admission. The gene expression signatures showed discriminatory power between pediatric septic shock survivors and non-survivors. Using functional enrichment analysis of differentially expressed genes, we validated the known genes and pathways in septic shock and identified unexplored septic shock-related genes and functional groups. Differential gene expression analysis revealed genes involved in the immune response, chemokine-mediated signaling, neutrophil chemotaxis, and chemokine activity, and distinguished septic shock survivors from non-survivors. The identification of septic shock gene biomarkers may facilitate septic shock diagnosis, treatment, and prognosis.
Affiliation(s)
- Akram Mohammed: University of Tennessee Health Science Center, Memphis, TN, USA
- Yan Cui: University of Tennessee Health Science Center, Memphis, TN, USA
- Valeria R Mas: University of Tennessee Health Science Center, Memphis, TN, USA
|
12
|
Polewko-Klim A, Lesiński W, Mnich K, Piliszek R, Rudnicki WR. Integration of multiple types of genetic markers for neuroblastoma may contribute to improved prediction of the overall survival. Biol Direct 2018; 13:17. [PMID: 30236139 PMCID: PMC6148774 DOI: 10.1186/s13062-018-0222-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 10/18/2017] [Accepted: 08/22/2018] [Indexed: 12/14/2022]
Abstract
Background: Modern experimental techniques deliver data sets containing profiles of tens of thousands of potential molecular and genetic markers that can be used to improve medical diagnostics. Previous studies performed with three different experimental methods on the same set of neuroblastoma patients create an opportunity to examine whether augmenting gene expression profiles with information on copy number variation can lead to improved predictions of patient survival. We propose a methodology based on a comprehensive cross-validation protocol that includes feature selection within the cross-validation loop and classification using machine learning. We also test the dependence of the results on the feature selection process using four different feature selection methods.

Results: The models utilising features selected based on information entropy are slightly, but significantly, better than those using features obtained with the t-test. Synergy between data on genetic variation and gene expression is possible, but not confirmed. A slight, but statistically significant, increase in the predictive power of machine learning models was observed for models built on combined data sets. It was found both with the out-of-bag estimate and in cross-validation performed on a single set of variables. However, the improvement was smaller and non-significant when models were built within the full cross-validation procedure that included feature selection within the cross-validation loop. Good correlation between the performance of the models in internal and external cross-validation was observed, confirming the robustness of the proposed protocol and results.

Conclusions: We have developed a protocol for building predictive machine learning models. The protocol can provide robust estimates of model performance on unseen data. It is particularly well-suited for small data sets. We have applied this protocol to develop prognostic models for neuroblastoma, using data on copy number variation and gene expression. We have shown that combining these two sources of information may increase the quality of the models. Nevertheless, the increase is small, and larger samples are required to reduce the noise and bias arising from overfitting.

Reviewers: This article was reviewed by Lan Hu, Tim Beissbarth and Dimitar Vassilev.
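The central idea of the abstract above, repeating feature selection inside each cross-validation fold so that held-out samples never influence which features are chosen, might be sketched as follows. This is a minimal, standard-library-only illustration and not the authors' implementation: the Welch t-statistic filter and the nearest-centroid classifier are simple stand-ins chosen for brevity, and every function name (`welch_t`, `select_features`, `fit_centroids`, `predict`, `cross_validate`, `make_data`) as well as the synthetic data are hypothetical.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t-statistic for two samples (0.0 if both have zero variance)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def select_features(X, y, k):
    """Rank features by |t| between the two classes; return indices of the top k."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append((welch_t(pos, neg), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def fit_centroids(X, y, feats):
    """Per-class mean vector restricted to the selected features."""
    return {
        c: [mean(row[j] for row, label in zip(X, y) if label == c) for j in feats]
        for c in (0, 1)
    }


def predict(centroids, row, feats):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((row[j] - m) ** 2 for j, m in zip(feats, centroids[c]))
    return min(centroids, key=dist)


def cross_validate(X, y, n_folds=5, k=5, seed=0):
    """k-fold CV accuracy with feature selection redone inside every fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        # Features are chosen from the training samples only, so the
        # held-out fold cannot leak into the selection step.
        feats = select_features(X_tr, y_tr, k)
        centroids = fit_centroids(X_tr, y_tr, feats)
        correct += sum(predict(centroids, X[i], feats) == y[i] for i in test_idx)
    return correct / len(X)


def make_data(n=40, n_features=50, n_informative=3, shift=3.0, seed=1):
    """Synthetic two-class data: only the first n_informative features differ."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += shift * label
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_data()
    print("cross-validated accuracy:", cross_validate(X, y))
```

Selecting features once on the full data set and then cross-validating only the classifier would correspond to the "single set of variables" setting that the abstract reports as giving more optimistic estimates.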
Affiliation(s)
- Aneta Polewko-Klim: Institute of Informatics, University of Białystok, Konstantego Ciołkowskiego 1M, Białystok, 15-245, Poland
- Wojciech Lesiński: Institute of Informatics, University of Białystok, Konstantego Ciołkowskiego 1M, Białystok, 15-245, Poland
- Krzysztof Mnich: Computational Centre, University of Białystok, Konstantego Ciołkowskiego 1M, Białystok, 15-245, Poland
- Radosław Piliszek: Computational Centre, University of Białystok, Konstantego Ciołkowskiego 1M, Białystok, 15-245, Poland
- Witold R Rudnicki: Institute of Informatics, University of Białystok, Konstantego Ciołkowskiego 1M, Białystok, 15-245, Poland; Computational Centre, University of Białystok, Konstantego Ciołkowskiego 1M, Białystok, 15-245, Poland; Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Pawlińskiego 5A, Warsaw, 02-106, Poland
|
13
|
Mohammed A, Biegert G, Adamec J, Helikar T. CancerDiscover: an integrative pipeline for cancer biomarker and cancer class prediction from high-throughput sequencing data. Oncotarget 2017; 9:2565-2573. [PMID: 29416792 PMCID: PMC5788660 DOI: 10.18632/oncotarget.23511] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 09/28/2017] [Accepted: 12/09/2017] [Indexed: 11/25/2022]
Abstract
Accurate identification of cancer biomarkers and classification of cancer type and subtype from High Throughput Sequencing (HTS) data is a challenging problem because it requires manual processing of raw HTS data from various sequencing platforms, quality control, and normalization, all of which are tedious and time-consuming. Machine learning techniques for cancer class prediction and biomarker discovery can hasten cancer detection and significantly improve prognosis. To date, considerable research effort has been devoted to cancer biomarker identification and cancer class prediction. However, currently available tools and pipelines lack flexibility in data preprocessing and in running multiple feature selection methods and learning algorithms; researchers therefore strongly need a freely available and easy-to-use program. Here, we propose CancerDiscover, an integrative open-source software pipeline that allows users to automatically and efficiently process large raw high-throughput datasets, normalize them, and select the best-performing features from multiple feature selection algorithms. Additionally, the integrative pipeline lets users apply different feature thresholds to identify cancer biomarkers and build various training models to distinguish different types and subtypes of cancer. The open-source software is available at https://github.com/HelikarLab/CancerDiscover and is free for use under the GPL3 license.
Affiliation(s)
- Akram Mohammed: Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, Nebraska, United States of America
- Greyson Biegert: Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, Nebraska, United States of America
- Jiri Adamec: Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, Nebraska, United States of America
- Tomáš Helikar: Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, Nebraska, United States of America
|