1
Alsaedi S, Ogasawara M, Alarawi M, Gao X, Gojobori T. AI-powered precision medicine: utilizing genetic risk factor optimization to revolutionize healthcare. NAR Genom Bioinform 2025; 7:lqaf038. [PMID: 40330081 PMCID: PMC12051108 DOI: 10.1093/nargab/lqaf038] [Received: 10/11/2024] [Revised: 02/11/2025] [Accepted: 04/17/2025] [Indexed: 05/08/2025] [Open Access]
Abstract
The convergence of artificial intelligence (AI) and biomedical data is transforming precision medicine by enabling the use of genetic risk factors (GRFs) for customized healthcare services based on individual needs. Although GRFs play an essential role in disease susceptibility, progression, and therapeutic outcomes, a gap exists in exploring their contribution to AI-powered precision medicine. This paper addresses this need by investigating the significance and potential of utilizing GRFs with AI in the medical field. We examine their applications, particularly emphasizing their impact on disease prediction, treatment personalization, and overall healthcare improvement. This review explores the application of AI algorithms to optimize the use of GRFs, aiming to advance precision medicine in disease screening, patient stratification, drug discovery, and understanding disease mechanisms. Through a variety of case studies and examples, we demonstrate the potential of incorporating GRFs facilitated by AI into medical practice, resulting in more precise diagnoses, targeted therapies, and improved patient outcomes. This review underscores the potential of GRFs, empowered by AI, to enhance precision medicine by improving diagnostic accuracy, treatment precision, and individualized healthcare solutions.
Affiliation(s)
- Sakhaa Alsaedi
  - Computer Science, Division of Computer, Electrical and Mathematical Sciences and Engineering (CEMSE), King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Kingdom of Saudi Arabia
  - Center of Excellence on Smart Health, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Kingdom of Saudi Arabia
  - Center of Excellence for Generative AI, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Kingdom of Saudi Arabia
  - College of Computer Science and Engineering (CCSE), Taibah University, 42353 Madinah, Kingdom of Saudi Arabia
- Michihiro Ogasawara
  - Department of Internal Medicine and Rheumatology, Juntendo University, 113-8431 Tokyo, Japan
- Mohammed Alarawi
  - Center of Excellence on Smart Health, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Kingdom of Saudi Arabia
  - Center of Excellence for Generative AI, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Kingdom of Saudi Arabia
  - Biological and Environmental Sciences and Engineering, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Kingdom of Saudi Arabia
- Xin Gao
  - Computer Science, Division of Computer, Electrical and Mathematical Sciences and Engineering (CEMSE), King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Kingdom of Saudi Arabia
  - Center of Excellence on Smart Health, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Kingdom of Saudi Arabia
  - Center of Excellence for Generative AI, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Kingdom of Saudi Arabia
- Takashi Gojobori
  - Center of Excellence on Smart Health, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Kingdom of Saudi Arabia
  - Center of Excellence for Generative AI, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Kingdom of Saudi Arabia
  - Biological and Environmental Sciences and Engineering, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Kingdom of Saudi Arabia
  - Marine Open Innovation Institute (MaOI), 113-8431 Shizuoka, Japan
2
Zeng J, Zhou H, Wan H, Yang J. Single-cell omics: moving towards a new era in ischemic stroke research. Eur J Pharmacol 2025; 1000:177725. [PMID: 40350018 DOI: 10.1016/j.ejphar.2025.177725] [Received: 06/27/2024] [Revised: 05/08/2025] [Accepted: 05/09/2025] [Indexed: 05/14/2025]
Abstract
Ischemic stroke (IS) is a highly complex and heterogeneous disease involving multiple pathophysiological events. A better understanding of the pathophysiology of IS will enhance preventive, diagnostic and therapeutic strategies. Despite significant advances in modern medicine, the molecular mechanisms of IS are still largely unknown. The high-throughput omics approach opens new avenues for identifying IS biomarkers and elucidating disease pathogenesis mechanisms. Single-cell omics enables a more thorough and in-depth analysis of the cellular interactions and properties in IS. This will lead to a better understanding of the onset, treatment and prognosis of IS. In this paper, we first reviewed the disease signatures and mechanisms research of IS. Subsequently, the use of single-cell omics to comprehend the mechanisms of IS was discussed, along with some recent developments in the field. To further delineate the upstream pathogenic alterations and downstream molecular impacts of IS, we also discussed the current use of machine learning approaches to single-cell omics data analysis. Particularly, single-cell omics is being used to inform risk assessment, early patient diagnosis and treatment strategies, and their potential impact on precision medicine. Thus, we summarized the role of single-cell omics in precision medicine. Despite the relative youth of the field, the development of single-cell omics promises to provide a powerful tool for elucidating the pathogenesis of IS.
Affiliation(s)
- Jieqiong Zeng
  - School of Basic Medicine Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China; School of Ecological and Environmental, Hubei Industrial Polytechnic, Shiyan, 442000, China
- Huifen Zhou
  - School of Basic Medicine Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Haitong Wan
  - School of Basic Medicine Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Jiehong Yang
  - School of Basic Medicine Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
3
Pang G, Li Y, Shi Q, Tian J, Lou H, Feng Y. Omics sciences for cervical cancer precision medicine from the perspective of the tumor immune microenvironment. Oncol Res 2025; 33:821-836. [PMID: 40191729 PMCID: PMC11964870 DOI: 10.32604/or.2024.053772] [Received: 05/10/2024] [Accepted: 08/01/2024] [Indexed: 04/09/2025] [Open Access]
Abstract
Immunotherapies have demonstrated notable clinical benefits in the treatment of cervical cancer (CC). However, the development of therapeutic resistance and diverse adverse effects in immunotherapy stem from complex interactions among biological processes and factors within the tumor immune microenvironment (TIME). Advanced omic technologies offer novel insights into a more expansive and thorough layer of the TIME. Furthermore, integrating multidimensional omics within the frameworks of systems biology and computational methodologies facilitates the generation of interpretable data outputs to characterize the clinical and biological trajectories of tumor behavior. In this review, we present advanced omics technologies that utilize various clinical samples to address scientific inquiries related to immunotherapies for CC, highlighting their utility in identifying metastasis dissemination, recurrence risk, and therapeutic resistance in patients treated with immunotherapeutic approaches. This review elaborates on the strategy for integrating multi-omics data through artificial intelligence algorithms. Additionally, an analysis of the obstacles encountered in the multi-omics analysis process and potential avenues for future research in this domain are presented.
Affiliation(s)
- Guanting Pang
  - College of Pharmaceutical Science, Zhejiang University of Technology, Hangzhou, 310014, China
  - Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, China
- Yaohan Li
  - College of Artificial Intelligence and Big Data for Medical Sciences, Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, 250000, China
  - Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, China
- Qiwen Shi
  - Collaborative Innovation Center for Green Pharmaceuticals, Zhejiang University of Technology, Hangzhou, 310014, China
- Jingkui Tian
  - Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, China
- Hanmei Lou
  - Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, China
  - Department of Gynecological Oncology, Zhejiang Cancer Hospital, Hangzhou, 310022, China
- Yue Feng
  - Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, China
  - Department of Gynecological Oncology, Zhejiang Cancer Hospital, Hangzhou, 310022, China
4
Thapa K, Kinali M, Pei S, Luna A, Babur Ö. Strategies to include prior knowledge in omics analysis with deep neural networks. Patterns (New York, N.Y.) 2025; 6:101203. [PMID: 40182174 PMCID: PMC11963003 DOI: 10.1016/j.patter.2025.101203] [Indexed: 04/05/2025]
Abstract
High-throughput molecular profiling technologies have revolutionized molecular biology research in the past decades. One important use of molecular data is to make predictions of phenotypes and other features of the organisms using machine learning algorithms. Deep learning models have become increasingly popular for this task due to their ability to learn complex non-linear patterns. Applying deep learning to molecular profiles, however, is challenging due to the very high dimensionality of the data and relatively small sample sizes, causing models to overfit. A solution is to incorporate biological prior knowledge to guide the learning algorithm for processing the functionally related input together. This helps regularize the models and improve their generalizability and interpretability. Here, we describe three major strategies proposed to use prior knowledge in deep learning models to make predictions based on molecular profiles. We review the related deep learning architectures, including the major ideas in relatively new graph neural networks.
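One widely used way to inject prior knowledge of the kind this abstract describes is to mask the connections of a dense layer so that each hidden unit stands for a pathway and only sees its member genes. The abstract does not name the three strategies it reviews, so the following is only a minimal sketch of that masking idea; the function name `masked_pathway_layer`, the toy weights, and the gene-to-pathway membership matrix are all hypothetical.

```python
import math


def masked_pathway_layer(expression, weights, membership, bias):
    """Forward pass of a pathway-masked dense layer (illustrative sketch).

    expression: list of gene expression values, one per gene.
    weights:    weights[p][g] is the weight from gene g to pathway unit p.
    membership: membership[p][g] is 1 if gene g belongs to pathway p, else 0.
    bias:       one bias term per pathway unit.
    """
    activations = []
    for w_row, m_row, b in zip(weights, membership, bias):
        # Multiplying by the 0/1 mask removes every gene outside the pathway,
        # so the unit can only respond to functionally related inputs.
        z = b + sum(w * m * x for w, m, x in zip(w_row, m_row, expression))
        activations.append(math.tanh(z))
    return activations


# Hypothetical toy setup: 3 genes, 2 pathways.
toy_weights = [[0.5, -0.2, 0.9], [0.1, 0.4, -0.3]]
toy_membership = [[1, 1, 0], [0, 1, 1]]  # pathway 0 = genes 0,1; pathway 1 = genes 1,2
toy_bias = [0.0, 0.0]

if __name__ == "__main__":
    print(masked_pathway_layer([1.0, 2.0, 3.0], toy_weights, toy_membership, toy_bias))
```

Because the mask fixes most weights at zero, the layer has far fewer free parameters than a fully connected one, which is the regularizing effect the abstract refers to, and each unit's activation can be read as a pathway-level score.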
Affiliation(s)
- Kisan Thapa
  - Computer Science Department, University of Massachusetts Boston, 100 Morrissey Boulevard, Boston, MA 02125, USA
- Meric Kinali
  - Computer Science Department, University of Massachusetts Boston, 100 Morrissey Boulevard, Boston, MA 02125, USA
- Shichao Pei
  - Computer Science Department, University of Massachusetts Boston, 100 Morrissey Boulevard, Boston, MA 02125, USA
- Augustin Luna
  - Developmental Therapeutics Branch, Center for Cancer Research, National Cancer Institute, NIH, 9000 Rockville Pike, Bethesda, MD 20892, USA
  - Computational Biology Branch, National Library of Medicine, NIH, 9000 Rockville Pike, Bethesda, MD 20892, USA
- Özgün Babur
  - Computer Science Department, University of Massachusetts Boston, 100 Morrissey Boulevard, Boston, MA 02125, USA
5
Gavriilidis GI, Vasileiou V, Dimitsaki S, Karakatsoulis G, Giannakakis A, Pavlopoulos GA, Psomopoulos F. APNet, an explainable sparse deep learning model to discover differentially active drivers of severe COVID-19. Bioinformatics 2025; 41:btaf063. [PMID: 39921901 PMCID: PMC11897427 DOI: 10.1093/bioinformatics/btaf063] [Received: 01/10/2024] [Revised: 01/18/2025] [Accepted: 02/05/2025] [Indexed: 02/10/2025] [Open Access]
Abstract
Motivation: Computational analyses of bulk and single-cell omics provide translational insights into complex diseases, such as COVID-19, by revealing molecules, cellular phenotypes, and signalling patterns that contribute to unfavourable clinical outcomes. Current in silico approaches dovetail differential abundance, biostatistics, and machine learning, but often overlook nonlinear proteomic dynamics, like post-translational modifications, and provide limited biological interpretability beyond feature ranking.
Results: We introduce APNet, a novel computational pipeline that combines differential activity analysis based on SJARACNe co-expression networks with PASNet, a biologically informed sparse deep learning model, to perform explainable predictions for COVID-19 severity. The APNet driver-pathway network ingests SJARACNe co-regulation and classification weights to aid result interpretation and hypothesis generation. APNet outperforms alternative models in patient classification across three COVID-19 proteomic datasets, identifying predictive drivers and pathways, including some confirmed in single-cell omics and highlighting under-explored biomarker circuitries in COVID-19.
Availability and implementation: APNet's R, Python scripts, and Cytoscape methodologies are available at https://github.com/BiodataAnalysisGroup/APNet.
Affiliation(s)
- George I Gavriilidis
  - Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thessaloniki, GR57001, Greece
- Vasileios Vasileiou
  - Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thessaloniki, GR57001, Greece
  - Department of Molecular Biology and Genetics, Democritus University of Thrace, Alexandroupolis, GR68100, Greece
- Stella Dimitsaki
  - Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thessaloniki, GR57001, Greece
- Georgios Karakatsoulis
  - Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thessaloniki, GR57001, Greece
- Antonis Giannakakis
  - Department of Molecular Biology and Genetics, Democritus University of Thrace, Alexandroupolis, GR68100, Greece
  - University Research Institute of Maternal and Child Health and Precision Medicine, National and Kapodistrian University of Athens, Athens, GR11527, Greece
- Georgios A Pavlopoulos
  - Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, GR16672, Greece
  - Center of New Biotechnologies & Precision Medicine, Department of Medicine, School of Health Sciences, National and Kapodistrian University of Athens, Athens, GR11528, Greece
- Fotis Psomopoulos
  - Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thessaloniki, GR57001, Greece
6
Kim J, Jang H, Park Y, Jung I, Jo K. ExPDrug: Integration of an interpretable neural network and knowledge graph for pathway-based drug repurposing. Comput Biol Med 2025; 187:109729. [PMID: 39884058 DOI: 10.1016/j.compbiomed.2025.109729] [Received: 08/05/2024] [Revised: 01/16/2025] [Accepted: 01/18/2025] [Indexed: 02/01/2025]
Abstract
Precision medicine aims to provide personalized therapies by analyzing patient molecular profiles, often focusing on gene expression data. However, effectively linking these data to actionable drug discovery for clinical application remains challenging. In this paper, we introduce ExPDrug, a neural network (NN) model that integrates biological pathways from transcriptomic data with a biomedical knowledge graph to facilitate pathway-based drug repurposing. ExPDrug enhances disease phenotype prediction by capturing the complex relationships between genes and pathways. Using layer-wise relevance propagation (LRP), the model interprets the contribution of each pathway using relevance scores applied in a random walk-with-restart (RWR) algorithm to prioritize potential drug candidates in the biomedical network. ExPDrug outperforms existing methods in predicting phenotypes for the three diseases and identifying drug candidates, as supported by the literature. This model offers a transformative approach for advancing precision medicine by linking transcriptomic insights directly to clinical drug repurposing, thereby potentially improving treatment strategies for complex diseases.
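The random walk with restart (RWR) step mentioned in this abstract can be illustrated with a small, generic sketch. This is not the ExPDrug implementation: the graph, node names, restart probability, and the function `random_walk_with_restart` are assumptions made for illustration. The seed vector plays the role of the pathway relevance scores, and the stationary probabilities rank the remaining nodes (for example, drugs) by proximity to the seeds.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Generic RWR by power iteration (illustrative sketch).

    adjacency: dict mapping node -> dict of neighbor -> edge weight.
               Every node must appear as a key; undirected edges are
               listed in both directions.
    seeds:     dict mapping node -> non-negative relevance score.
    Returns a dict of stationary visiting probabilities that sum to 1.
    """
    nodes = sorted(adjacency)
    seed_total = sum(seeds.values())
    restart_dist = {n: seeds.get(n, 0.0) / seed_total for n in nodes}
    out_weight = {u: sum(adjacency[u].values()) for u in nodes}

    prob = dict(restart_dist)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to the seeds.
        nxt = {n: restart * restart_dist[n] for n in nodes}
        for u in nodes:
            if out_weight[u] == 0:
                # A node without edges sends its mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * prob[u] * restart_dist[n]
                continue
            # Otherwise the walker moves along an edge, proportional to weight.
            for v, w in adjacency[u].items():
                nxt[v] += (1 - restart) * prob[u] * w / out_weight[u]
        change = sum(abs(nxt[n] - prob[n]) for n in nodes)
        prob = nxt
        if change < tol:
            break
    return prob


# Hypothetical toy network: one pathway, one gene, two drugs.
toy_graph = {
    "pathway_A": {"drug_X": 1.0, "gene_1": 1.0},
    "drug_X": {"pathway_A": 1.0},
    "gene_1": {"pathway_A": 1.0, "drug_Y": 1.0},
    "drug_Y": {"gene_1": 1.0},
}
toy_scores = random_walk_with_restart(toy_graph, {"pathway_A": 1.0})

if __name__ == "__main__":
    for node, score in sorted(toy_scores.items(), key=lambda kv: -kv[1]):
        print(f"{node}: {score:.4f}")
```

In this toy network `drug_X` is directly connected to the seed pathway while `drug_Y` is two steps away, so `drug_X` receives the higher score. A larger restart probability keeps the walk closer to the seeds and makes the ranking more local.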
Affiliation(s)
- Junku Kim
  - Department of Computer Engineering, Chungbuk National University, Cheongju, Republic of Korea
- Hojoong Jang
  - Department of Computer Engineering, Chungbuk National University, Cheongju, Republic of Korea
- Youngjun Park
  - Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Germany
- Inuk Jung
  - School of Computer Science and Engineering, Kyungpook National University, Daegu, Republic of Korea
- Kyuri Jo
  - Department of Computer Engineering, Chungbuk National University, Cheongju, Republic of Korea
7
Poursaeed R, Mohammadzadeh M, Safaei AA. Survival prediction of glioblastoma patients using machine learning and deep learning: a systematic review. BMC Cancer 2024; 24:1581. [PMID: 39731064 DOI: 10.1186/s12885-024-13320-4] [Received: 03/09/2024] [Accepted: 12/10/2024] [Indexed: 12/29/2024] [Open Access]
Abstract
Glioblastoma Multiforme (GBM), classified as a grade IV glioma by the World Health Organization (WHO), is a prevalent and notably aggressive form of brain tumor derived from glial cells. It stands as one of the most severe forms of primary brain cancer in humans. The median survival time of GBM patients is only 12-15 months, making it the most lethal type of brain tumor. Every year, about 200,000 people worldwide succumb to this disease. GBM is also highly heterogeneous, meaning that its characteristics and behavior vary widely among different patients. This leads to different outcomes and survival times for each individual. Predicting the survival of GBM patients accurately can have multiple benefits. It can enable optimal and personalized treatment planning based on the patient's condition and prognosis. It can also help patients and their families cope with the possible outcomes and make informed decisions about their care and quality of life. Furthermore, it can help researchers discover the most relevant biomarkers, features, and mechanisms of the disease and design more effective and personalized therapies. Artificial intelligence methods, such as machine learning and deep learning, have been widely applied to survival prediction in various diseases, such as breast cancer, lung cancer, gastric cancer, cervical cancer, liver cancer, prostate cancer, and COVID-19. This systematic review summarizes the current state-of-the-art methods for predicting glioblastoma survival using different types of input data, such as clinical features, molecular markers, imaging features, radiomics features, omics data, or a combination of them. Following PRISMA guidelines, we searched databases from 2015 to 2024, reviewing 107 articles meeting our criteria. We analyzed the data sources, methods, performance metrics, and outcomes of the studies. We found that random forest was the most popular method, and a combination of radiomics and clinical data was the most common input data.
Affiliation(s)
- Roya Poursaeed
  - Department of Data Science, Faculty of Interdisciplinary Science and Technology, Tarbiat Modares University, Tehran, Iran
- Mohsen Mohammadzadeh
  - Department of Data Science, Faculty of Interdisciplinary Science and Technology, Tarbiat Modares University, Tehran, Iran
  - Department of Statistics, Faculty of Mathematical Sciences, Tarbiat Modares University, Tehran, Iran
- Ali Asghar Safaei
  - Department of Data Science, Faculty of Interdisciplinary Science and Technology, Tarbiat Modares University, Tehran, Iran
  - Department of Medical Informatics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
8
Wang FA, Li Y, Zeng T. Deep Learning of radiology-genomics integration for computational oncology: A mini review. Comput Struct Biotechnol J 2024; 23:2708-2716. [PMID: 39035833 PMCID: PMC11260400 DOI: 10.1016/j.csbj.2024.06.019] [Received: 03/06/2024] [Revised: 06/18/2024] [Accepted: 06/18/2024] [Indexed: 07/23/2024] [Open Access]
Abstract
In the field of computational oncology, patient status is often assessed using radiology-genomics, which combines two key technologies and data types: radiology and genomics. Recent advances in deep learning have facilitated the integration of radiology-genomics data, and even new omics data, significantly improving the robustness and accuracy of clinical predictions. These factors are driving artificial intelligence (AI) closer to practical clinical applications. In particular, deep learning models are crucial in identifying new radiology-genomics biomarkers and therapeutic targets, supported by explainable AI (xAI) methods. This review focuses on recent developments in deep learning for radiology-genomics integration, highlights current challenges, and outlines research directions for multimodal integration and biomarker discovery in radiology-genomics or radiology-omics that are urgently needed in computational oncology.
Affiliation(s)
- Feng-ao Wang
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
- Yixue Li
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
  - Guangzhou National Laboratory, Guangzhou, China
  - GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China
- Tao Zeng
  - Guangzhou National Laboratory, Guangzhou, China
  - GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China
9
Lu H, Rezapour M, Baha H, Khalid Khan Niazi M, Narayanan A, Nafi Gurcan M. Classification-based pathway analysis using GPNet with novel P-value computation. Brief Bioinform 2024; 26:bbaf039. [PMID: 39879387 PMCID: PMC11775473 DOI: 10.1093/bib/bbaf039] [Received: 10/31/2024] [Revised: 12/19/2024] [Accepted: 01/15/2025] [Indexed: 01/31/2025] [Open Access]
Abstract
Pathway analysis plays a critical role in bioinformatics, enabling researchers to identify biological pathways associated with various conditions by analyzing gene expression data. However, the rise of large, multi-center datasets has highlighted limitations in traditional methods like Over-Representation Analysis (ORA) and Functional Class Scoring (FCS), which struggle with low signal-to-noise ratios (SNR) and large sample sizes. To tackle these challenges, we use a deep learning-based classification method, Gene PointNet (GPNet), and a novel P-value computation approach leveraging the confusion matrix to address pathway analysis tasks. We validated our method's effectiveness through a comparative study using a simulated dataset and RNA-Seq data from The Cancer Genome Atlas breast cancer dataset. Our method was benchmarked against traditional techniques (ORA, FCS), shallow machine learning models (logistic regression, support vector machine), and deep learning approaches (DeepHisCom, PASNet). The results demonstrate that GPNet outperforms these methods in low-SNR, large-sample datasets, where it remains robust and reliable, significantly reducing Type I error and improving power. This makes our method well suited for pathway analysis in large, multi-center studies. The code can be found at https://github.com/haolu123/GPNet_pathway.
Affiliation(s)
- Hao Lu
  - Center for Artificial Intelligence Research, Wake Forest University School of Medicine, Winston-Salem, NC 27101, United States
- Mostafa Rezapour
  - Center for Artificial Intelligence Research, Wake Forest University School of Medicine, Winston-Salem, NC 27101, United States
- Haseebullah Baha
  - School of Systems Biology, College of Science, George Mason University, Fairfax, VA 22030, United States
- Aarthi Narayanan
  - Department of Biology, George Mason University, Fairfax, VA 22030, United States
- Metin Nafi Gurcan
  - Center for Artificial Intelligence Research, Wake Forest University School of Medicine, Winston-Salem, NC 27101, United States
10
Rodríguez Mallma MJ, Zuloaga-Rotta L, Borja-Rosales R, Rodríguez Mallma JR, Vilca-Aguilar M, Salas-Ojeda M, Mauricio D. Explainable Machine Learning Models for Brain Diseases: Insights from a Systematic Review. Neurol Int 2024; 16:1285-1307. [PMID: 39585057 PMCID: PMC11587041 DOI: 10.3390/neurolint16060098] [Received: 08/28/2024] [Revised: 10/10/2024] [Accepted: 10/23/2024] [Indexed: 11/26/2024] [Open Access]
Abstract
In recent years, Artificial Intelligence (AI) methods, specifically Machine Learning (ML) models, have been providing outstanding results in different areas of knowledge, with the health area being one of its most impactful fields of application. However, to be applied reliably, these models must provide users with clear, simple, and transparent explanations about the medical decision-making process. This systematic review aims to investigate the use and application of explainability in ML models used in brain disease studies. A systematic search was conducted in three major bibliographic databases, Web of Science, Scopus, and PubMed, from January 2014 to December 2023. A total of 133 relevant studies were identified and analyzed out of a total of 682 found in the initial search, in which the explainability of ML models in the medical context was studied, identifying 11 ML models and 12 explainability techniques applied in the study of 20 brain diseases.
Affiliation(s)
- Mirko Jerber Rodríguez Mallma
  - Facultad de Ingeniería Industrial y de Sistemas, Universidad Nacional de Ingeniería, Lima 15333, Peru
- Luis Zuloaga-Rotta
  - Facultad de Ingeniería Industrial y de Sistemas, Universidad Nacional de Ingeniería, Lima 15333, Peru
- Rubén Borja-Rosales
  - Facultad de Ingeniería Industrial y de Sistemas, Universidad Nacional de Ingeniería, Lima 15333, Peru
- Josef Renato Rodríguez Mallma
  - Facultad de Ingeniería Industrial y de Sistemas, Universidad Nacional de Ingeniería, Lima 15333, Peru
- María Salas-Ojeda
  - Facultad de Artes y Humanidades, Universidad San Ignacio de Loyola, Lima 15024, Peru
- David Mauricio
  - Facultad de Ingeniería de Sistemas e Informática, Universidad Nacional Mayor de San Marcos, Lima 15081, Peru
11
Tian L, Xiao J, Yu T. A robust statistical approach for finding informative spatially associated pathways. Brief Bioinform 2024; 25:bbae543. [PMID: 39451157 PMCID: PMC11503753 DOI: 10.1093/bib/bbae543] [Received: 04/15/2024] [Revised: 08/27/2024] [Accepted: 10/13/2024] [Indexed: 10/26/2024] [Open Access]
Abstract
Spatial transcriptomics offers deep insights into cellular functional localization and communication by mapping gene expression to spatial locations. Traditional approaches that focus on selecting spatially variable genes often overlook the complexity of biological pathways and the interactions among genes. Here, we introduce a novel framework that shifts the focus towards directly identifying functional pathways associated with spatial variability by adapting the Brownian distance covariance test in an innovative manner to explore the heterogeneity of biological functions over space. Unlike most other methods, this statistical testing approach is free of gene selection and parameter selection and allows nonlinear and complex dependencies. It allows for a deeper understanding of how cells coordinate their activities across different spatial domains through biological pathways. By analyzing real human and mouse datasets, the method found significant pathways that were associated with spatial variation, as well as different pathway patterns among inner- and edge-cancer regions. This innovative framework offers a new perspective on analyzing spatial transcriptomic data, contributing to our understanding of tissue architecture and disease pathology. The implementation is publicly available at https://github.com/tianlq-prog/STpathway.
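The Brownian distance covariance statistic that this abstract builds on can be sketched in a few lines. The code below computes the standard sample distance covariance between spot coordinates and a multi-gene pathway expression vector, with a permutation test for significance. It is a generic illustration, not the authors' adapted procedure from STpathway; the function names, the grid layout, and the toy expression values are hypothetical.

```python
import math
import random


def _centered_distances(points):
    """Pairwise Euclidean distance matrix, double-centered."""
    n = len(points)
    dist = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in dist]
    grand_mean = sum(row_mean) / n
    # The matrix is symmetric, so column means equal row means.
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_covariance_sq(x, y):
    """Squared sample distance covariance between paired observations x and y."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    n = len(x)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2


def permutation_p_value(x, y, n_perm=200, seed=0):
    """Permutation p-value: shuffle y against x to break any dependence."""
    rng = random.Random(seed)
    observed = distance_covariance_sq(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if distance_covariance_sq(x, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical 5x5 grid of spots; the two-gene "pathway" varies with the column.
toy_coords = [(r, c) for r in range(5) for c in range(5)]
toy_pathway = [(c, c * c) for r in range(5) for c in range(5)]
toy_constant = [(1.0, 1.0)] * 25  # a pathway with no spatial variation

if __name__ == "__main__":
    print(distance_covariance_sq(toy_coords, toy_pathway))
    print(permutation_p_value(toy_coords, toy_pathway))
```

Because the statistic works on pairwise distances, the pathway is treated as a whole vector per spot, which is why no per-gene selection step is needed, and dependence does not have to be linear to be detected. A pathway with identical expression at every spot has zero distance covariance with the coordinates.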
Affiliation(s)
- Leqi Tian
  - School of Data Science, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), Shenzhen, Guangdong 518172, P.R. China
  - Shenzhen Research Institute of Big Data, Shenzhen, Guangdong 518172, P.R. China
- Jiashun Xiao
  - Shenzhen Research Institute of Big Data, Shenzhen, Guangdong 518172, P.R. China
- Tianwei Yu
  - School of Data Science, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), Shenzhen, Guangdong 518172, P.R. China
  - Shenzhen Research Institute of Big Data, Shenzhen, Guangdong 518172, P.R. China
12
Marullo G, Ulrich L, Antonaci FG, Audisio A, Aprato A, Massè A, Vezzetti E. Classification of AO/OTA 31A/B femur fractures in X-ray images using YOLOv8 and advanced data augmentation techniques. Bone Rep 2024; 22:101801. [PMID: 39324016 PMCID: PMC11422035 DOI: 10.1016/j.bonr.2024.101801] [Received: 06/26/2024] [Revised: 08/20/2024] [Accepted: 09/05/2024] [Indexed: 09/27/2024] [Open Access]
Abstract
Femur fractures are a significant worldwide public health concern that affects patients as well as their families because of their high frequency, morbidity, and mortality. When employing computer-aided diagnostic (CAD) technologies, promising results have been shown in the efficiency and accuracy of fracture classification, particularly with the growing use of Deep Learning (DL) approaches. Nevertheless, the complexity is further increased by the need to collect enough input data to train these algorithms and the challenge of interpreting the findings. By improving on the results of the most recent deep learning-based Arbeitsgemeinschaft für Osteosynthesefragen and Orthopaedic Trauma Association (AO/OTA) system classification of femur fractures, this study intends to support physicians in making correct and timely decisions regarding patient care. A state-of-the-art architecture, YOLOv8, was used and refined while paying close attention to the interpretability of the model. Furthermore, data augmentation techniques were involved during preprocessing, increasing the dataset samples through image processing alterations. The fine-tuned YOLOv8 model achieved remarkable results, with 0.9 accuracy, 0.85 precision, 0.85 recall, and 0.85 F1-score, computed by averaging the values among all the individual classes for each metric. This study shows the proposed architecture's effectiveness in enhancing the AO/OTA system's classification of femur fractures, assisting physicians in making prompt and accurate diagnoses.
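The abstract reports metrics "computed by averaging the values among all the individual classes", which corresponds to macro-averaging. The sketch below shows how such macro-averaged accuracy, precision, recall, and F1-score can be derived from a multi-class confusion matrix. It assumes a one-vs-rest definition of per-class accuracy, which the abstract does not spell out, and the confusion matrix is an invented three-class example, not data from the study.

```python
def macro_metrics(confusion):
    """Macro-averaged metrics from a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    per_class = {"accuracy": [], "precision": [], "recall": [], "f1": []}

    for c in range(n_classes):
        tp = confusion[c][c]
        actual = sum(confusion[c])                                  # row sum
        predicted = sum(confusion[r][c] for r in range(n_classes))  # column sum
        fp = predicted - tp
        fn = actual - tp
        tn = total - tp - fp - fn

        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0

        # One-vs-rest accuracy for this class (an assumption, see lead-in).
        per_class["accuracy"].append((tp + tn) / total)
        per_class["precision"].append(precision)
        per_class["recall"].append(recall)
        per_class["f1"].append(f1)

    # Macro average: every class counts equally, regardless of its size.
    return {name: sum(values) / n_classes for name, values in per_class.items()}


# Invented example with three fracture classes (rows = true, columns = predicted).
toy_confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]

if __name__ == "__main__":
    for name, value in macro_metrics(toy_confusion).items():
        print(f"{name}: {value:.3f}")
```

Macro-averaging gives rare fracture classes the same weight as common ones, so a model that ignores a small class is penalized more than it would be under a sample-weighted average.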
Affiliation(s)
- Giorgia Marullo
- Department of Management, Production, and Design, Politecnico di Torino, C.so Duca degli Abruzzi, 24, Torino 10129, Italy
| | - Luca Ulrich
- Department of Management, Production, and Design, Politecnico di Torino, C.so Duca degli Abruzzi, 24, Torino 10129, Italy
| | - Francesca Giada Antonaci
- Department of Management, Production, and Design, Politecnico di Torino, C.so Duca degli Abruzzi, 24, Torino 10129, Italy
| | - Andrea Audisio
- Pediatric Orthopaedics and Traumatology, Regina Margherita Children's Hospital, Torino 10126, Italy
| | - Alessandro Aprato
- Department of Surgical Sciences, University of Turin, Torino 10124, Italy
| | - Alessandro Massè
- Department of Surgical Sciences, University of Turin, Torino 10124, Italy
| | - Enrico Vezzetti
- Department of Management, Production, and Design, Politecnico di Torino, C.so Duca degli Abruzzi, 24, Torino 10129, Italy
| |
|
13
|
Ozcelik F, Dundar MS, Yildirim AB, Henehan G, Vicente O, Sánchez-Alcázar JA, Gokce N, Yildirim DT, Bingol NN, Karanfilska DP, Bertelli M, Pojskic L, Ercan M, Kellermayer M, Sahin IO, Greiner-Tollersrud OK, Tan B, Martin D, Marks R, Prakash S, Yakubi M, Beccari T, Lal R, Temel SG, Fournier I, Ergoren MC, Mechler A, Salzet M, Maffia M, Danalev D, Sun Q, Nei L, Matulis D, Tapaloaga D, Janecke A, Bown J, Cruz KS, Radecka I, Ozturk C, Nalbantoglu OU, Sag SO, Ko K, Arngrimsson R, Belo I, Akalin H, Dundar M. The impact and future of artificial intelligence in medical genetics and molecular medicine: an ongoing revolution. Funct Integr Genomics 2024; 24:138. [PMID: 39147901 DOI: 10.1007/s10142-024-01417-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2024] [Revised: 08/01/2024] [Accepted: 08/05/2024] [Indexed: 08/17/2024]
Abstract
Artificial intelligence (AI) platforms have emerged as pivotal tools in genetics and molecular medicine, as in many other fields. The growth in patient data, identification of new diseases and phenotypes, discovery of new intracellular pathways, availability of greater sets of omics data, and the need to continuously analyse them have led to the development of new AI platforms. AI continues to weave its way into the fabric of genetics with the potential to unlock new discoveries and enhance patient care. This technology is setting the stage for breakthroughs across various domains, including dysmorphology, rare hereditary diseases, cancers, clinical microbiomics, the investigation of zoonotic diseases, and omics studies in all medical disciplines. AI's role in facilitating a deeper understanding of these areas heralds a new era of personalised medicine, where treatments and diagnoses are tailored to the individual's molecular features, offering a more precise approach to combating genetic or acquired disorders. The significance of these AI platforms is growing as they assist healthcare professionals in the diagnostic and treatment processes, marking a pivotal shift towards more informed, efficient, and effective medical practice. In this review, we explore the range of AI tools available and show how they have become vital in various sectors of genomic research supporting clinical decisions.
Affiliation(s)
- Firat Ozcelik
- Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
| | - Mehmet Sait Dundar
- Department of Electrical and Computer Engineering, Graduate School of Engineering and Sciences, Abdullah Gul University, Kayseri, Turkey
| | - A Baki Yildirim
- Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
| | - Gary Henehan
- School of Food Science and Environmental Health, Technological University of Dublin, Dublin, Ireland
| | - Oscar Vicente
- Institute for the Conservation and Improvement of Valencian Agrodiversity (COMAV), Universitat Politècnica de València, Valencia, Spain
| | - José A Sánchez-Alcázar
- Centro de Investigación Biomédica en Red: Enfermedades Raras, Centro Andaluz de Biología del Desarrollo (CABD-CSIC-Universidad Pablo de Olavide), Instituto de Salud Carlos III, Sevilla, Spain
| | - Nuriye Gokce
- Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
| | - Duygu T Yildirim
- Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
| | - Nurdeniz Nalbant Bingol
- Department of Translational Medicine, Institute of Health Sciences, Bursa Uludag University, Bursa, Turkey
| | - Dijana Plaseska Karanfilska
- Research Centre for Genetic Engineering and Biotechnology, Macedonian Academy of Sciences and Arts, Skopje, Macedonia
| | | | - Lejla Pojskic
- Institute for Genetic Engineering and Biotechnology, University of Sarajevo, Sarajevo, Bosnia and Herzegovina
| | - Mehmet Ercan
- Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
| | - Miklos Kellermayer
- Department of Biophysics and Radiation Biology, Faculty of Medicine, Semmelweis University, Budapest, Hungary
| | - Izem Olcay Sahin
- Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
| | | | - Busra Tan
- Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
| | - Donald Martin
- University Grenoble Alpes, CNRS, TIMC-IMAG/SyNaBi (UMR 5525), Grenoble, France
| | - Robert Marks
- Avram and Stella Goldstein-Goren Department of Biotechnology Engineering, Ben-Gurion University of the Negev, Be'er Sheva, Israel
| | - Satya Prakash
- Department of Biomedical Engineering, University of McGill, Montreal, QC, Canada
| | - Mustafa Yakubi
- Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
| | - Tommaso Beccari
- Department of Pharmaceutical Sciences, University of Perugia, Perugia, Italy
| | - Ratnesh Lal
- Neuroscience Research Institute, University of California, Santa Barbara, USA
| | - Sehime G Temel
- Department of Translational Medicine, Institute of Health Sciences, Bursa Uludag University, Bursa, Turkey
- Department of Medical Genetics, Bursa Uludag University Faculty of Medicine, Bursa, Turkey
- Department of Histology and Embryology, Faculty of Medicine, Bursa Uludag University, Bursa, Turkey
| | - Isabelle Fournier
- Réponse Inflammatoire et Spectrométrie de Masse-PRISM, University of Lille, Lille, France
| | - M Cerkez Ergoren
- Department of Medical Genetics, Near East University Faculty of Medicine, Nicosia, Cyprus
| | - Adam Mechler
- Department of Chemistry, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC, Australia
| | - Michel Salzet
- Réponse Inflammatoire et Spectrométrie de Masse-PRISM, University of Lille, Lille, France
| | - Michele Maffia
- Department of Experimental Medicine, University of Salento, Via Lecce-Monteroni, Lecce, 73100, Italy
| | - Dancho Danalev
- University of Chemical Technology and Metallurgy, Sofia, Bulgaria
| | - Qun Sun
- Department of Food Science and Technology, Sichuan University, Chengdu, China
| | - Lembit Nei
- School of Engineering Tallinn University of Technology, Tartu College, Tartu, Estonia
| | - Daumantas Matulis
- Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Dana Tapaloaga
- Faculty of Veterinary Medicine, University of Agronomic Sciences and Veterinary Medicine of Bucharest, Bucharest, Romania
| | - Andres Janecke
- Department of Paediatrics I, Medical University of Innsbruck, Innsbruck, Austria
- Division of Human Genetics, Medical University of Innsbruck, Innsbruck, Austria
| | - James Bown
- School of Science, Engineering and Technology, Abertay University, Dundee, UK
| | | | - Iza Radecka
- School of Science, Faculty of Science and Engineering, University of Wolverhampton, Wolverhampton, UK
| | - Celal Ozturk
- Department of Software Engineering, Erciyes University, Kayseri, Turkey
| | - Ozkan Ufuk Nalbantoglu
- Department of Computer Engineering, Engineering Faculty, Erciyes University, Kayseri, Turkey
| | - Sebnem Ozemri Sag
- Department of Medical Genetics, Bursa Uludag University Faculty of Medicine, Bursa, Turkey
| | - Kisung Ko
- Department of Medicine, College of Medicine, Chung-Ang University, Seoul, Korea
| | - Reynir Arngrimsson
- Iceland Landspitali University Hospital, University of Iceland, Reykjavik, Iceland
| | - Isabel Belo
- Centre of Biological Engineering, University of Minho, Braga, Portugal
| | - Hilal Akalin
- Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey.
| | - Munis Dundar
- Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey.
| |
|
14
|
van Hilten A, van Rooij J, Ikram MA, Niessen WJ, van Meurs JBJ, Roshchupkin GV. Phenotype prediction using biologically interpretable neural networks on multi-cohort multi-omics data. NPJ Syst Biol Appl 2024; 10:81. [PMID: 39095438 PMCID: PMC11297229 DOI: 10.1038/s41540-024-00405-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Accepted: 07/12/2024] [Indexed: 08/04/2024] Open
Abstract
Integrating multi-omics data into predictive models has the potential to enhance accuracy, which is essential for precision medicine. In this study, we developed interpretable predictive models for multi-omics data by employing neural networks informed by prior biological knowledge, referred to as visible networks. These neural networks offer insights into the decision-making process and can unveil novel perspectives on the underlying biological mechanisms associated with traits and complex diseases. We tested the performance, interpretability and generalizability for inferring smoking status, subject age and LDL levels using genome-wide RNA expression and CpG methylation data from the blood of the BIOS consortium (four population cohorts, total N = 2940). In a cohort-wise cross-validation setting, the consistency of the diagnostic performance and interpretation was assessed. Performance was consistently high for predicting smoking status with an overall mean AUC of 0.95 (95% CI: 0.90-1.00) and interpretation revealed the involvement of well-replicated genes such as AHRR, GPR15 and LRRN3. LDL-level predictions were only generalized in a single cohort with an R² of 0.07 (95% CI: 0.05-0.08). Age was inferred with a mean error of 5.16 (95% CI: 3.97-6.35) years with the genes COL11A2, AFAP1, OTUD7A, PTPRN2, ADARB2 and CD34 consistently predictive. For both regression tasks, we found that using multi-omics networks improved performance, stability and generalizability compared to interpretable single omic networks. We believe that visible neural networks have great potential for multi-omics analysis; they combine multi-omic data elegantly, are interpretable, and generalize well to data from different cohorts.
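The cohort-wise cross-validation mentioned in the abstract can be illustrated with a leave-one-cohort-out split, in which each cohort is held out once as the test set. This is a sketch under the assumption that one whole cohort is held out per fold. The abstract does not specify the exact scheme, and the helper name and cohort labels below are hypothetical.

```python
def leave_one_cohort_out(cohort_labels):
    """Yield (held_out_cohort, train_indices, test_indices), one fold per cohort."""
    for held_out in sorted(set(cohort_labels)):
        train = [i for i, c in enumerate(cohort_labels) if c != held_out]
        test = [i for i, c in enumerate(cohort_labels) if c == held_out]
        yield held_out, train, test


# Hypothetical cohort label for each of eight samples (four cohorts, as in the study design).
cohorts = ["A", "A", "B", "C", "B", "D", "C", "D"]
splits = list(leave_one_cohort_out(cohorts))

for name, train_idx, test_idx in splits:
    # A model would be fitted on train_idx and scored on test_idx here.
    print(name, train_idx, test_idx)
```

Because no sample from the held-out cohort is seen during training, the per-fold score reflects how well a model transfers to an unseen cohort rather than to unseen samples from a familiar one.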
Affiliation(s)
- Arno van Hilten
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands.
| | - Jeroen van Rooij
- Department of Internal Medicine, Erasmus MC, Rotterdam, The Netherlands
| | - M Arfan Ikram
- Department of Imaging Physics, Delft University of Technology, Delft, The Netherlands
| | - Wiro J Niessen
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands
- Department of Imaging Physics, Delft University of Technology, Delft, The Netherlands
| | - Joyce B J van Meurs
- Department of Internal Medicine, Erasmus MC, Rotterdam, The Netherlands
- Department of Orthopaedics and Sports Medicine, Erasmus MC, Rotterdam, The Netherlands
| | - Gennady V Roshchupkin
- Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands
| |
|
15
|
Ding X, Zhang L, Fan M, Li L. TME-NET: an interpretable deep neural network for predicting pan-cancer immune checkpoint inhibitor responses. Brief Bioinform 2024; 25:bbae410. [PMID: 39167797 PMCID: PMC11337220 DOI: 10.1093/bib/bbae410] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Revised: 07/17/2024] [Accepted: 08/02/2024] [Indexed: 08/23/2024] Open
Abstract
Immunotherapy with immune checkpoint inhibitors (ICIs) is increasingly used to treat various tumor types. Determining patient responses to ICIs presents a significant clinical challenge. Although components of the tumor microenvironment (TME) are used to predict patient outcomes, comprehensive assessments of the TME are frequently overlooked. Using a top-down approach, the TME was divided into five layers: outcome, immune role, cell, cellular component, and gene. Using this structure, a neural network called TME-NET was developed to predict responses to ICIs. Model parameter weights and cell ablation studies were used to investigate the influence of TME components. The model was developed and evaluated using a pan-cancer cohort of 948 patients across four cancer types, with Area Under the Curve (AUC) and accuracy as performance metrics. Results show that TME-NET surpasses established models such as support vector machine and k-nearest neighbors in AUC and accuracy. Visualization of model parameter weights showed that at the cellular layer, Th1 cells enhance immune responses, whereas myeloid-derived suppressor cells and M2 macrophages show strong immunosuppressive effects. Cell ablation studies further confirmed the impact of these cells. At the gene layer, the transcription factors STAT4 in Th1 cells and IRF4 in M2 macrophages significantly affect TME dynamics. Additionally, the cytokine-encoding genes IFNG from Th1 cells and ARG1 from M2 macrophages are crucial for modulating immune responses within the TME. Survival data from immunotherapy cohorts confirmed the prognostic ability of these markers, with p-values <0.01. In summary, TME-NET performs well in predicting immunotherapy responses and offers interpretable insights into the immunotherapy process. It can be customized at https://immbal.shinyapps.io/TME-NET.
Affiliation(s)
- Xiaobao Ding
- Institute of Biomedical Engineering and Instrumentation, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China
- Institute of Big Data and Artificial Intelligence in Medicine, School of Electronics and Information Engineering, Taizhou University, Taizhou 318000, Zhejiang, China
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, 310018, China
| | - Lin Zhang
- Institute of Biomedical Engineering and Instrumentation, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China
| | - Ming Fan
- Institute of Biomedical Engineering and Instrumentation, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China
| | - Lihua Li
- Institute of Biomedical Engineering and Instrumentation, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, 310018, China
| |
|
16
|
Hao J, Liu Y, Mo Z, Liu X, Sun H, Li J. Integrated Multi-Omics and Whole Slide Images for Survival Prediction in Glioblastoma Using Multiple Instance Learning and Co-Attention. Annu Int Conf IEEE Eng Med Biol Soc 2024; 2024:1-4. [PMID: 40039442 DOI: 10.1109/embc53108.2024.10782321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2025]
Abstract
Glioblastoma multiforme (GBM) is the most aggressive adult brain tumor and presents significant treatment challenges due to its poor prognosis and heterogeneity. Despite the rapid development of deep learning, integrating multi-omics with whole slide images (WSIs) for survival prediction remains difficult. This study aims to improve GBM prognosis by integrating WSIs with multi-omic data through the incorporation of biological pathway knowledge. Utilizing multiple instance learning and co-attention mechanisms, we initiated the integration of multi-omic data informed by biological pathways, leveraging existing knowledge of molecular interactions. The proposed model was evaluated using data from 214 GBM patients from The Cancer Genome Atlas. This dataset included 447 WSIs and multi-omic features such as 927 RNA sequencing gene expressions, 1,168 copy number alterations, and 1,489 DNA methylation patterns. These multi-omic features were organized into nine biological pathways, each selected based on their relevance to GBM, ensuring a targeted and biologically informed strategy for survival prediction. Our results show that the proposed model outperforms existing benchmarks by at least 4.5%, highlighting the potential of incorporating additional biological knowledge into the integration of multimodal data to improve GBM survival prediction.
|
17
|
Tang M, Antić Ž, Fardzadeh P, Pietzsch S, Schröder C, Eberhardt A, van Bömmel A, Escherich G, Hofmann W, Horstmann MA, Illig T, McCrary JM, Lentes J, Metzler M, Nejdl W, Schlegelberger B, Schrappe M, Zimmermann M, Miarka-Walczyk K, Pastorczak A, Cario G, Renard BY, Stanulla M, Bergmann AK. An artificial intelligence-assisted clinical framework to facilitate diagnostics and translational discovery in hematologic neoplasia. EBioMedicine 2024; 104:105171. [PMID: 38810562 PMCID: PMC11154115 DOI: 10.1016/j.ebiom.2024.105171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 05/10/2024] [Accepted: 05/15/2024] [Indexed: 05/31/2024] Open
Abstract
BACKGROUND The increasing volume and intricacy of sequencing data, along with other clinical and diagnostic data, like drug responses and measurable residual disease, creates challenges for efficient clinical comprehension and interpretation. Using paediatric B-cell precursor acute lymphoblastic leukaemia (BCP-ALL) as a use case, we present an artificial intelligence (AI)-assisted clinical framework clinALL that integrates genomic and clinical data into a user-friendly interface to support routine diagnostics and reveal translational insights for hematologic neoplasia. METHODS We performed targeted RNA sequencing in 1365 cases with haematological neoplasms, primarily paediatric B-cell precursor acute lymphoblastic leukaemia (BCP-ALL) from the AIEOP-BFM ALL study. We carried out fluorescence in situ hybridization (FISH), karyotyping and arrayCGH as part of the routine diagnostics. The analysis results of these assays as well as additional clinical information were integrated into an interactive web interface using Bokeh, where the main graph is based on Uniform Manifold Approximation and Projection (UMAP) analysis of the gene expression data. At the backend of the clinALL, we built both shallow machine learning models and a deep neural network using Scikit-learn and PyTorch respectively. FINDINGS By applying clinALL, 78% of undetermined patients under the current diagnostic protocol were stratified, and ambiguous cases were investigated. Translational insights were discovered, including IKZF1plus status dependent subpopulations of BCR::ABL1 positive patients, and a subpopulation within ETV6::RUNX1 positive patients that has a high relapse frequency. Our best machine learning models, LDA and PASNET-like neural network models, achieve F1 scores above 97% in predicting patients' subgroups. 
INTERPRETATION An AI-assisted clinical framework that integrates both genomic and clinical data can take full advantage of the available data, improve point-of-care decision-making and reveal clinically relevant insights promptly. Such a lightweight and easily transferable framework works for both whole transcriptome data as well as the cost-effective targeted RNA-seq, enabling efficient and equitable delivery of personalized medicine in small clinics in developing countries. FUNDING German Ministry of Education and Research (BMBF), German Research Foundation (DFG) and Foundation for Polish Science.
Affiliation(s)
- Ming Tang
- Department of Human Genetics, Hannover Medical School, Hannover, Germany; L3S Research Centre, Leibniz University Hannover, Germany
| | - Željko Antić
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
| | | | - Stefan Pietzsch
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
| | - Charlotte Schröder
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
| | | | - Alena van Bömmel
- Leibniz Institute on Aging - Fritz Lipmann Institute (FLI), Jena, Germany
| | - Gabriele Escherich
- Clinic of Paediatric Haematology and Oncology, University Medical Centre Hamburg-Eppendorf, Hamburg, Germany
| | - Winfried Hofmann
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
| | - Martin A Horstmann
- Clinic of Paediatric Haematology and Oncology, University Medical Centre Hamburg-Eppendorf, Hamburg, Germany; Research Institute Children's Cancer Centre Hamburg, Hamburg, Germany
| | - Thomas Illig
- Hannover Unified Bio Bank, Hannover Medical School, Hannover, Germany
| | - J Matt McCrary
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
| | - Jana Lentes
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
| | - Markus Metzler
- Department of Paediatrics, University Hospital Erlangen, Erlangen, Germany
| | - Wolfgang Nejdl
- L3S Research Centre, Leibniz University Hannover, Germany
| | | | - Martin Schrappe
- Department of Paediatrics, University Medical Centre Schleswig-Holstein, Campus Kiel, Kiel, Germany
| | - Martin Zimmermann
- Department of Paediatric Haematology and Oncology, Hannover Medical School, Hannover, Germany
| | - Karolina Miarka-Walczyk
- Department of Paediatrics, Oncology and Haematology, Medical University of Lodz, Lodz, Poland
| | - Agata Pastorczak
- Department of Paediatrics, Oncology and Haematology, Medical University of Lodz, Lodz, Poland
| | - Gunnar Cario
- Department of Paediatrics, University Medical Centre Schleswig-Holstein, Campus Kiel, Kiel, Germany
| | - Bernhard Y Renard
- Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam, Germany
| | - Martin Stanulla
- Department of Paediatric Haematology and Oncology, Hannover Medical School, Hannover, Germany
| | | |
|
18
|
Ko E, Kim Y, Shokoohi F, Mersha TB, Kang M. SPIN: sex-specific and pathway-based interpretable neural network for sexual dimorphism analysis. Brief Bioinform 2024; 25:bbae239. [PMID: 38807262 PMCID: PMC11133003 DOI: 10.1093/bib/bbae239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2023] [Revised: 03/29/2024] [Accepted: 04/26/2024] [Indexed: 05/30/2024] Open
Abstract
Sexual dimorphism in prevalence, severity and genetic susceptibility exists for most common diseases. However, most genetic and clinical outcome studies are designed in a sex-combined framework that treats sex as a covariate. The few sex-specific studies have analyzed males and females separately and thereby failed to identify gene-by-sex interactions. Here, we propose a novel unified, biologically interpretable deep learning-based framework (named SPIN) for sexual dimorphism analysis. We demonstrate that SPIN significantly improved the C-index by up to 23.6% in TCGA cancer datasets, and it was further validated using asthma datasets. In addition, SPIN identifies sex-specific and sex-shared risk loci that are often missed in previous sex-combined or sex-separate analyses. We also show that SPIN is interpretable, explaining how biological pathways contribute to sexual dimorphism, and improves risk prediction at the individual level, which can support the development of precision medicine tailored to a specific individual's characteristics.
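The C-index reported above measures how well predicted risks order patients by survival time. As a rough illustration, the sketch below implements Harrell's concordance index for right-censored data. It is a generic textbook formulation, not code from SPIN, and the function name and toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times[i]  : observed time for subject i
    events[i] : 1 if the event was observed, 0 if censored
    risks[i]  : predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four subjects, the third one censored.
c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.4, 0.5, 0.1],
)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which an improvement in C-index should be read.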
Affiliation(s)
- Euiseong Ko
- Department of Computer Science, University of Nevada, Las Vegas, Las Vegas, NV, USA
| | - Youngsoon Kim
- Department of Information and Statistics and Department of Bio&Medical Bigdata (BK21 Four program), Gyeongsang National University, Jinju, Republic of Korea
| | - Farhad Shokoohi
- Department of Mathematical Sciences, University of Nevada, Las Vegas, Las Vegas, NV, USA
| | - Tesfaye B Mersha
- Department of Pediatrics, Cincinnati Children’s Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
| | - Mingon Kang
- Department of Computer Science, University of Nevada, Las Vegas, Las Vegas, NV, USA
| |
|
19
|
Liu X, Tao Y, Cai Z, Bao P, Ma H, Li K, Li M, Zhu Y, Lu ZJ. Pathformer: a biological pathway informed transformer for disease diagnosis and prognosis using multi-omics data. Bioinformatics 2024; 40:btae316. [PMID: 38741230 PMCID: PMC11139513 DOI: 10.1093/bioinformatics/btae316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 03/29/2024] [Accepted: 05/11/2024] [Indexed: 05/16/2024] Open
Abstract
MOTIVATION Multi-omics data provide a comprehensive view of gene regulation at multiple levels, which is helpful in achieving accurate diagnosis of complex diseases like cancer. However, conventional integration methods rarely utilize prior biological knowledge and lack interpretability. RESULTS To integrate various multi-omics data of tissue and liquid biopsies for disease diagnosis and prognosis, we developed a biological pathway informed Transformer, Pathformer. It embeds multi-omics input with a compacted multi-modal vector and a pathway-based sparse neural network. Pathformer also leverages a criss-cross attention mechanism to capture the crosstalk between different pathways and modalities. We first benchmarked Pathformer against 18 comparable methods on multiple cancer datasets, where Pathformer outperformed all the other methods, with an average improvement of 6.3%-14.7% in F1 score for cancer survival prediction, 5.1%-12% for cancer stage prediction, and 8.1%-13.6% for cancer drug response prediction. Subsequently, for cancer prognosis prediction based on tissue multi-omics data, we used a case study to demonstrate the biological interpretability of Pathformer by identifying key pathways and their biological crosstalk. Then, for early cancer diagnosis based on liquid biopsy data, we used plasma and platelet datasets to demonstrate Pathformer's potential for clinical application in cancer screening. Moreover, we revealed deregulation of interesting pathways (e.g. scavenger receptor pathway) and their crosstalk in cancer patients' blood, providing potential candidate targets for cancer microenvironment study. AVAILABILITY AND IMPLEMENTATION Pathformer is implemented and freely available at https://github.com/lulab/Pathformer.
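The idea behind a pathway-based sparse layer can be sketched as a linear layer whose connections are masked by pathway membership, so each pathway node only receives input from its member genes. The code below is a generic illustration of that masking idea and is not Pathformer's implementation. The gene names, pathway names, weights and helper functions are all hypothetical.

```python
def pathway_mask(genes, pathways):
    """Build mask[p][g] = 1 if gene g belongs to pathway p, else 0."""
    return [[1 if gene in members else 0 for gene in genes]
            for members in pathways.values()]


def masked_forward(x, weights, mask):
    """One sparse linear layer: masked-out weights contribute nothing to the output."""
    return [sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
            for w_row, m_row in zip(weights, mask)]


# Hypothetical genes and pathway membership (placeholders, not real annotations).
genes = ["gene_a", "gene_b", "gene_c"]
pathways = {"pathway_1": {"gene_a", "gene_b"}, "pathway_2": {"gene_c"}}

mask = pathway_mask(genes, pathways)
weights = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]  # dense weights before masking
expression = [1.0, 2.0, 3.0]                  # one sample's gene-level input
pathway_scores = masked_forward(expression, weights, mask)
```

Since each output node is tied to a named pathway, its learned weights and activations can be read as pathway-level contributions, which is the source of interpretability in this family of models.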
Affiliation(s)
- Xiaofan Liu
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
| | - Yuhuan Tao
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
| | - Zilin Cai
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Pengfei Bao
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
| | - Hongli Ma
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
| | - Kexing Li
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Mengtao Li
- Department of Rheumatology and Clinical Immunology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Peking Union Medical College, National Clinical Research Center for Dermatologic and Immunologic Diseases (NCRC-DID), MST State Key Laboratory of Complex Severe and Rare Diseases, MOE Key Laboratory of Rheumatology and Clinical Immunology, Beijing 100730, China
| | - Yunping Zhu
- State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
| | - Zhi John Lu
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
| |
|
20
|
Shannon CP, Lee AH, Tebbutt SJ, Singh A. A Commentary on Multi-omics Data Integration in Systems Vaccinology. J Mol Biol 2024; 436:168522. [PMID: 38458605 DOI: 10.1016/j.jmb.2024.168522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 03/04/2024] [Accepted: 03/04/2024] [Indexed: 03/10/2024]
Affiliation(s)
| | - Amy Hy Lee
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, Canada
| | - Scott J Tebbutt
- PROOF Centre of Excellence, Vancouver, Canada; Department of Medicine, The University of British Columbia, Vancouver, Canada; Centre for Heart Lung Innovation, Vancouver, Canada
| | - Amrit Singh
- Centre for Heart Lung Innovation, Vancouver, Canada; Department of Anesthesiology, Pharmacology and Therapeutics, The University of British Columbia, Vancouver, Canada.
| |
|
21
|
Kumar S, Sarmah DT, Paul A, Chatterjee S. Exploration of functional relations among differentially co-expressed genes identifies regulators in glioblastoma. Comput Biol Chem 2024; 109:108024. [PMID: 38335855 DOI: 10.1016/j.compbiolchem.2024.108024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 12/15/2023] [Accepted: 02/02/2024] [Indexed: 02/12/2024]
Abstract
Conventional computational approaches to investigating a disease face inherent constraints, as they often fall short in going beyond protein functional associations and grasping their deeper contextual significance within the disease framework. Such context-specificity can be explored using clinical data by evaluating how the interaction between biological entities changes across conditions, that is, by investigating differential co-expression relationships. We believe that integrating and analysing differential co-expression together with functional relationships, primarily focusing on the source nodes, will open novel insights into disease progression, as the source proteins could trigger signaling cascades, mostly because they are transcription factors, cell surface receptors, or enzymes that respond instantly to a particular stimulus. A thorough contextual investigation of these nodes could provide a useful starting point for identifying potential causal linkages and guiding subsequent scientific investigations to uncover mechanisms underlying observed associations. Our methodology combines functional protein-protein interaction (PPI) data with co-expression information and filters functional linkages through a series of critical steps, culminating in the identification of a robust set of regulators. Our analysis identified eleven key regulators in glioblastoma: AKT1, BRCA1, CAMK2G, CUL1, FGFR3, KIF3A, NUP210, PRKACB, RAB8A, RPS6KA2 and TGFB3. These regulators play a pivotal role in disease classification, cell growth control, and patient survivability and exhibit associations with immune infiltrations and disease hallmarks. This underscores the importance of assessing correlation towards causality in unraveling complex biological insights.
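Differential co-expression, as described above, asks whether the correlation between two genes changes between conditions. A minimal sketch follows, assuming Pearson correlation and a simple difference of coefficients as the score. The abstract does not state which measure the authors used, and the gene names and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def differential_coexpression(control, disease, gene1, gene2):
    """Change in correlation of a gene pair from control to disease samples."""
    return (pearson(disease[gene1], disease[gene2])
            - pearson(control[gene1], control[gene2]))


# Hypothetical expression values per gene across four samples in each condition.
control = {"gene_x": [1, 2, 3, 4], "gene_y": [2, 4, 6, 8]}
disease = {"gene_x": [1, 2, 3, 4], "gene_y": [8, 6, 4, 2]}
delta = differential_coexpression(control, disease, "gene_x", "gene_y")
```

In this toy case the pair flips from perfectly correlated to perfectly anti-correlated, giving the largest possible change. In practice the score would be computed for many pairs and tested for significance before any edge is kept.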
Affiliation(s)
- Shivam Kumar
  - Complex Analysis Group, Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad 121001, India
- Dipanka Tanu Sarmah
  - Complex Analysis Group, Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad 121001, India
- Abhijit Paul
  - Complex Analysis Group, Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad 121001, India
- Samrat Chatterjee
  - Complex Analysis Group, Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad 121001, India.

22
Zhao X, Singhal A, Park S, Kong J, Bachelder R, Ideker T. Cancer Mutations Converge on a Collection of Protein Assemblies to Predict Resistance to Replication Stress. Cancer Discov 2024; 14:508-523. [PMID: 38236062 PMCID: PMC10905674 DOI: 10.1158/2159-8290.cd-23-0641] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 06/12/2023] [Revised: 10/25/2023] [Accepted: 12/21/2023] [Indexed: 01/19/2024]
Abstract
Rapid proliferation is a hallmark of cancer associated with sensitivity to therapeutics that cause DNA replication stress (RS). Many tumors exhibit drug resistance, however, via molecular pathways that are incompletely understood. Here, we develop an ensemble of predictive models that elucidate how cancer mutations impact the response to common RS-inducing (RSi) agents. The models implement recent advances in deep learning to facilitate multidrug prediction and mechanistic interpretation. Initial studies in tumor cells identify 41 molecular assemblies that integrate alterations in hundreds of genes for accurate drug response prediction. These cover roles in transcription, repair, cell-cycle checkpoints, and growth signaling, of which 30 are shown by loss-of-function genetic screens to regulate drug sensitivity or replication restart. The model translates to cisplatin-treated cervical cancer patients, highlighting an RTK-JAK-STAT assembly governing resistance. This study defines a compendium of mechanisms by which mutations affect therapeutic responses, with implications for precision medicine. SIGNIFICANCE: Zhao and colleagues use recent advances in machine learning to study the effects of tumor mutations on the response to common therapeutics that cause RS. The resulting predictive models integrate numerous genetic alterations distributed across a constellation of molecular assemblies, facilitating a quantitative and interpretable assessment of drug response. This article is featured in Selected Articles from This Issue, p. 384.
Affiliation(s)
- Xiaoyu Zhao
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California, San Diego, La Jolla, California
- Akshat Singhal
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California
- Sungjoon Park
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California, San Diego, La Jolla, California
- JungHo Kong
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California, San Diego, La Jolla, California
  - Moores Cancer Center, School of Medicine, University of California, San Diego, La Jolla, California
- Robin Bachelder
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California, San Diego, La Jolla, California
- Trey Ideker
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California, San Diego, La Jolla, California
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California
  - Moores Cancer Center, School of Medicine, University of California, San Diego, La Jolla, California
  - Department of Bioengineering, University of California, San Diego, La Jolla, California

23
Antić Ž, Lentes J, Bergmann AK. Cytogenetics and genomics in pediatric acute lymphoblastic leukaemia. Best Pract Res Clin Haematol 2023; 36:101511. [PMID: 38092485 DOI: 10.1016/j.beha.2023.101511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 07/24/2023] [Accepted: 08/15/2023] [Indexed: 12/18/2023]
Abstract
The last five decades have witnessed significant improvement in diagnostics, treatment and management of children with acute lymphoblastic leukaemia (ALL). These advancements have become possible through progress in our understanding of the genetic and biological background of ALL, resulting in the introduction of risk-adapted treatment and novel therapeutic targets, e.g., tyrosine kinase inhibitors for BCR::ABL1-positive ALL. Further advances in the taxonomy of ALL and the discovery of new genetic biomarkers and therapeutic targets, as well as the introduction of targeted and immunotherapies into the frontline treatment protocols, may improve management and outcome of children with ALL. In this review we describe the current developments in the (cyto)genetic diagnostics and management of children with ALL, and provide an overview of the most important advances in the genetic classification of ALL. Furthermore, we discuss perspectives resulting from the development of new techniques, including artificial intelligence (AI).
Affiliation(s)
- Željko Antić
  - Department of Human Genetics, Hannover Medical School (MHH), Hannover, Germany
- Jana Lentes
  - Department of Human Genetics, Hannover Medical School (MHH), Hannover, Germany
- Anke K Bergmann
  - Department of Human Genetics, Hannover Medical School (MHH), Hannover, Germany.

24
Carrion SA, Michal JJ, Jiang Z. Alternative Transcripts Diversify Genome Function for Phenome Relevance to Health and Diseases. Genes (Basel) 2023; 14:2051. [PMID: 38002994 PMCID: PMC10671453 DOI: 10.3390/genes14112051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 11/06/2023] [Accepted: 11/07/2023] [Indexed: 11/26/2023] Open
Abstract
Manipulation using alternative exon splicing (AES), alternative transcription start (ATS), and alternative polyadenylation (APA) sites are key to transcript diversity underlying health and disease. All three are pervasive in organisms, present in at least 50% of human protein-coding genes. In fact, ATS and APA site use has the highest impact on protein identity, with their ability to alter which first and last exons are utilized as well as impacting stability and translation efficiency. These RNA variants have been shown to be highly specific, both in tissue type and stage, with demonstrated importance to cell proliferation, differentiation and the transition from fetal to adult cells. While alternative exon splicing has a limited effect on protein identity, its ubiquity highlights the importance of these minor alterations, which can alter other features such as localization. The three processes are also highly interwoven, with overlapping, complementary, and competing factors, RNA polymerase II and its CTD (C-terminal domain) chief among them. Their role in development means dysregulation leads to a wide variety of disorders and cancers, with some forms of disease disproportionately affected by specific mechanisms (AES, ATS, or APA). Challenges associated with the genome-wide profiling of RNA variants and their potential solutions are also discussed in this review.
Affiliation(s)
- Zhihua Jiang
  - Department of Animal Sciences and Center for Reproductive Biology, Washington State University, Pullman, WA 99164-7620, USA; (S.A.C.); (J.J.M.)

25
Esser-Skala W, Fortelny N. Reliable interpretability of biology-inspired deep neural networks. NPJ Syst Biol Appl 2023; 9:50. [PMID: 37816807 PMCID: PMC10564878 DOI: 10.1038/s41540-023-00310-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/22/2023] [Accepted: 09/15/2023] [Indexed: 10/12/2023] Open
Abstract
Deep neural networks display impressive performance but suffer from limited interpretability. Biology-inspired deep learning, where the architecture of the computational graph is based on biological knowledge, enables unique interpretability where real-world concepts are encoded in hidden nodes, which can be ranked by importance and thereby interpreted. In such models trained on single-cell transcriptomes, we previously demonstrated that node-level interpretations lack robustness upon repeated training and are influenced by biases in biological knowledge. Similar studies are missing for related models. Here, we test and extend our methodology for reliable interpretability in P-NET, a biology-inspired model trained on patient mutation data. We observe variability of interpretations and susceptibility to knowledge biases, and identify the network properties that drive interpretation biases. We further present an approach to control the robustness and biases of interpretations, which leads to more specific interpretations. In summary, our study reveals the broad importance of methods to ensure robust and bias-aware interpretability in biology-inspired deep learning.
Affiliation(s)
- Wolfgang Esser-Skala
  - Computational Systems Biology Group, Department of Biosciences and Medical Biology, University of Salzburg, Hellbrunner Straße 34, 5020, Salzburg, Austria
- Nikolaus Fortelny
  - Computational Systems Biology Group, Department of Biosciences and Medical Biology, University of Salzburg, Hellbrunner Straße 34, 5020, Salzburg, Austria.

26
Zhang L, Cao L, Li S, Wang L, Song Y, Huang Y, Xu Z, He J, Wang M, Li K. Biologically Interpretable Deep Learning To Predict Response to Immunotherapy In Advanced Melanoma Using Mutations and Copy Number Variations. J Immunother 2023; 46:221-231. [PMID: 37220017 DOI: 10.1097/cji.0000000000000475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Accepted: 04/13/2023] [Indexed: 05/25/2023]
Abstract
Only 30-40% of advanced melanoma patients respond effectively to immunotherapy in clinical practice, so it is necessary to accurately identify the response of patients to immunotherapy pre-clinically. Here, we develop KP-NET, a deep learning model that is sparse on KEGG pathways, and combine it with transfer learning to accurately predict the response of advanced melanomas to immunotherapy using KEGG pathway-level information enriched from gene mutation and copy number variation data. The KP-NET demonstrates the best performance, with an AUROC of 0.886 on the testing set and 0.803 on an unseen evaluation set, when predicting responders (CR/PR/SD with PFS ≥6 mo) versus non-responders (PD/SD with PFS <6 mo) in anti-CTLA-4 treated melanoma patients. The model also achieves AUROCs of 0.917 and 0.833 on the testing and evaluation sets, respectively, in predicting CR/PR versus PD. Meanwhile, the AUROC is 0.913 when predicting responders versus non-responders in anti-PD-1/PD-L1 melanomas. Moreover, the KP-NET reveals some genes and pathways associated with response to anti-CTLA-4 treatment, such as the genes PIK3CA, AOX1 and CBLB, and the ErbB signaling and T cell receptor signaling pathways, among others. In conclusion, the KP-NET can accurately predict the response of melanomas to immunotherapy and screen related biomarkers pre-clinically, which can contribute to precision medicine of melanoma.
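The idea of a network that is "sparse on pathways" can be illustrated with a small sketch. This is not the KP-NET implementation. It assumes only the general technique of masking a dense layer so that each pathway node receives input from its member genes alone. The gene indices, pathway assignments, and weights below are hypothetical placeholders, not real KEGG memberships.

```python
import math

def masked_pathway_layer(gene_values, weights, mask):
    """One pathway-constrained layer.

    weights[p][g] is the weight from gene g to pathway node p; mask[p][g] is 1
    when gene g belongs to pathway p and 0 otherwise, so non-member genes
    contribute nothing to that node.
    """
    activations = []
    for w_row, m_row in zip(weights, mask):
        total = sum(w * m * v for w, m, v in zip(w_row, m_row, gene_values))
        activations.append(math.tanh(total))
    return activations

# Hypothetical toy setup: 4 genes, 2 pathways.
# Pathway 0 contains genes 0 and 1; pathway 1 contains genes 2 and 3.
mask = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
weights = [
    [0.5, -0.3, 0.8, 0.1],
    [0.2, 0.4, -0.6, 0.9],
]

altered = [1.0, 0.0, 1.0, 0.0]   # e.g. 1 = gene carries an alteration
pathway_scores = masked_pathway_layer(altered, weights, mask)

# Changing a gene outside pathway 0 leaves the pathway 0 node untouched.
altered_other = [1.0, 0.0, 0.0, 0.0]
pathway_scores_other = masked_pathway_layer(altered_other, weights, mask)
print(pathway_scores, pathway_scores_other)
```

Because each hidden node corresponds to a named pathway, its activation or an importance score derived from it can be read directly as pathway-level evidence. This is the kind of interpretability the abstract refers to.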
Affiliation(s)
- Liuchao Zhang
  - Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin, China

27
Beaude A, Rafiee Vahid M, Augé F, Zehraoui F, Hanczar B. AttOmics: attention-based architecture for diagnosis and prognosis from omics data. Bioinformatics 2023; 39:i94-i102. [PMID: 37387182 DOI: 10.1093/bioinformatics/btad232] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION: The increasing availability of high-throughput omics data allows for considering a new medicine centered on individual patients. Precision medicine relies on exploiting these high-throughput data with machine-learning models, especially the ones based on deep-learning approaches, to improve diagnosis. Due to the high-dimensional small-sample nature of omics data, current deep-learning models end up with many parameters and have to be fitted with a limited training set. Furthermore, interactions between molecular entities inside an omics profile are not patient specific but are the same for all patients. RESULTS: In this article, we propose AttOmics, a new deep-learning architecture based on the self-attention mechanism. First, we decompose each omics profile into a set of groups, where each group contains related features. Then, by applying the self-attention mechanism to the set of groups, we can capture the different interactions specific to a patient. The results of different experiments carried out in this article show that our model can accurately predict the phenotype of a patient with fewer parameters than deep neural networks. Visualizing the attention maps can provide new insights into the essential groups for a particular phenotype. AVAILABILITY AND IMPLEMENTATION: The code and data are available at https://forge.ibisc.univ-evry.fr/abeaude/AttOmics. TCGA data can be downloaded from the Genomic Data Commons Data Portal.
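The two steps described in the abstract above (grouping features, then applying self-attention across the groups) can be sketched as follows. This is a simplified illustration, not the AttOmics code. It assumes equal-size contiguous groups and uses the raw group vectors as queries, keys, and values, with no learned projections. All names and values are hypothetical.

```python
import math

def split_into_groups(profile, group_size):
    """Cut one omics profile into consecutive groups of `group_size` features."""
    return [profile[i:i + group_size] for i in range(0, len(profile), group_size)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(groups):
    """Scaled dot-product self-attention with Q = K = V = the group vectors."""
    dim = len(groups[0])
    attention_map = []
    for query in groups:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in groups
        ]
        attention_map.append(softmax(scores))
    # Each output vector is the attention-weighted mix of all group vectors.
    attended = [
        [sum(w * value[j] for w, value in zip(row, groups)) for j in range(dim)]
        for row in attention_map
    ]
    return attention_map, attended

# Hypothetical profile of 8 features for one patient, split into 4 groups of 2.
profile = [0.2, 1.5, 0.1, 0.9, 2.0, 0.3, 0.0, 0.4]
groups = split_into_groups(profile, 2)
attention_map, attended = self_attention(groups)
for row in attention_map:
    print([round(w, 3) for w in row])
```

Because the attention weights are computed from each patient's own profile, the resulting map differs from patient to patient. That is the sense in which the interactions become patient specific.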
Affiliation(s)
- Aurélien Beaude
  - IBISC, Université Paris-Saclay, Univ Evry, 23 Boulevard de France, Evry-Courcouronnes 91020, France
  - Artificial Intelligence & Deep Analytics, Omics Data Science, Sanofi R&D Data and Data Science, 1 Av. Pierre Brossolette, Chilly-Mazarin 91385, France
- Milad Rafiee Vahid
  - Sanofi R&D Data and Data Science, Artificial Intelligence & Deep Analytics, Omics Data Science, 450 Water Street, Cambridge, MA 02142, United States
- Franck Augé
  - Artificial Intelligence & Deep Analytics, Omics Data Science, Sanofi R&D Data and Data Science, 1 Av. Pierre Brossolette, Chilly-Mazarin 91385, France
- Farida Zehraoui
  - IBISC, Université Paris-Saclay, Univ Evry, 23 Boulevard de France, Evry-Courcouronnes 91020, France
- Blaise Hanczar
  - IBISC, Université Paris-Saclay, Univ Evry, 23 Boulevard de France, Evry-Courcouronnes 91020, France

28
Wysocka M, Wysocki O, Zufferey M, Landers D, Freitas A. A systematic review of biologically-informed deep learning models for cancer: fundamental trends for encoding and interpreting oncology data. BMC Bioinformatics 2023; 24:198. [PMID: 37189058 PMCID: PMC10186658 DOI: 10.1186/s12859-023-05262-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 08/17/2022] [Accepted: 03/30/2023] [Indexed: 05/17/2023] Open
Abstract
BACKGROUND: There is an increasing interest in the use of Deep Learning (DL) based methods as a supporting analytical framework in oncology. However, most direct applications of DL will deliver models with limited transparency and explainability, which constrain their deployment in biomedical settings. METHODS: This systematic review discusses DL models used to support inference in cancer biology with a particular emphasis on multi-omics analysis. It focuses on how existing models address the need for better dialogue with prior knowledge, biological plausibility and interpretability, fundamental properties in the biomedical domain. For this, we retrieved and analyzed 42 studies focusing on emerging architectural and methodological advances, the encoding of biological domain knowledge and the integration of explainability methods. RESULTS: We discuss the recent evolutionary arc of DL models in the direction of integrating prior biological relational and network knowledge to support better generalisation (e.g. pathways or Protein-Protein-Interaction networks) and interpretability. This represents a fundamental functional shift towards models which can integrate mechanistic and statistical inference aspects. We introduce a concept of bio-centric interpretability and according to its taxonomy, we discuss representational methodologies for the integration of domain prior knowledge in such models. CONCLUSIONS: The paper provides a critical outlook into contemporary methods for explainability and interpretability used in DL for cancer. The analysis points in the direction of a convergence between encoding prior knowledge and improved interpretability. We introduce bio-centric interpretability which is an important step towards formalisation of biological interpretability of DL models and developing methods that are less problem- or application-specific.
Affiliation(s)
- Magdalena Wysocka
  - Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, University of Manchester, Oxford Rd, Manchester, M13 9 PL UK
  - Department of Computer Science, University of Manchester, Oxford Rd, Manchester, M13 9 PL UK
- Oskar Wysocki
  - Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, University of Manchester, Oxford Rd, Manchester, M13 9 PL UK
  - Department of Computer Science, University of Manchester, Oxford Rd, Manchester, M13 9 PL UK
  - Idiap Research Institute, National University of Sciences, Rue Marconi 19, CH - 1920 Martigny, Switzerland
- Marie Zufferey
  - Idiap Research Institute, National University of Sciences, Rue Marconi 19, CH - 1920 Martigny, Switzerland
- Dónal Landers
  - DeLondra Oncology Ltd, 38 Carlton Avenue, Wilmslow, SK9 4EP UK
- André Freitas
  - Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, University of Manchester, Oxford Rd, Manchester, M13 9 PL UK
  - Department of Computer Science, University of Manchester, Oxford Rd, Manchester, M13 9 PL UK
  - Idiap Research Institute, National University of Sciences, Rue Marconi 19, CH - 1920 Martigny, Switzerland

29
Janizek JD, Spiro A, Celik S, Blue BW, Russell JC, Lee TI, Kaeberlin M, Lee SI. PAUSE: principled feature attribution for unsupervised gene expression analysis. Genome Biol 2023; 24:81. [PMID: 37076856 PMCID: PMC10114348 DOI: 10.1186/s13059-023-02901-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Accepted: 03/17/2023] [Indexed: 04/21/2023] Open
Abstract
As interest in using unsupervised deep learning models to analyze gene expression data has grown, an increasing number of methods have been developed to make these models more interpretable. These methods can be separated into two groups: post hoc analyses of black box models through feature attribution methods and approaches to build inherently interpretable models through biologically-constrained architectures. We argue that these approaches are not mutually exclusive, but can in fact be usefully combined. We propose PAUSE (https://github.com/suinleelab/PAUSE), an unsupervised pathway attribution method that identifies major sources of transcriptomic variation when combined with biologically-constrained neural network models.
Affiliation(s)
- Joseph D Janizek
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA
  - Medical Scientist Training Program, University of Washington, Seattle, USA
- Anna Spiro
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA
- Ben W Blue
  - Department of Pathology, University of Washington, Seattle, USA
- John C Russell
  - Department of Pathology, University of Washington, Seattle, USA
- Ting-I Lee
  - Department of Pathology, University of Washington, Seattle, USA
- Matt Kaeberlin
  - Department of Pathology, University of Washington, Seattle, USA
  - Department of Genome Sciences, University of Washington, Seattle, USA
- Su-In Lee
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA.

30
Huang Y, Rong Z, Zhang L, Xu Z, Ji J, He J, Liu W, Hou Y, Li K. HiRAND: A novel GCN semi-supervised deep learning-based framework for classification and feature selection in drug research and development. Front Oncol 2023; 13:1047556. [PMID: 36776339 PMCID: PMC9909422 DOI: 10.3389/fonc.2023.1047556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2022] [Accepted: 01/03/2023] [Indexed: 01/28/2023] Open
Abstract
The prediction of response to drugs before initiating therapy based on transcriptome data is a major challenge. However, identifying effective drug response label data costs time and resources. Methods available often predict poorly and fail to identify robust biomarkers due to the curse of dimensionality: high dimensionality and low sample size. Therefore, this necessitates the development of predictive models to effectively predict the response to drugs using limited labeled data while being interpretable. In this study, we report a novel Hierarchical Graph Random Neural Networks (HiRAND) framework to predict the drug response using transcriptome data of few labeled data and additional unlabeled data. HiRAND completes the information integration of the gene graph and sample graph by graph convolutional network (GCN). The innovation of our model is leveraging data augmentation strategy to solve the dilemma of limited labeled data and using consistency regularization to optimize the prediction consistency of unlabeled data across different data augmentations. The results showed that HiRAND achieved better performance than competitive methods in various prediction scenarios, including both simulation data and multiple drug response data. We found that the prediction ability of HiRAND in the drug vorinostat showed the best results across all 62 drugs. In addition, HiRAND was interpreted to identify the key genes most important to vorinostat response, highlighting critical roles for ribosomal protein-related genes in the response to histone deacetylase inhibition. Our HiRAND could be utilized as an efficient framework for improving the drug response prediction performance using few labeled data.
Affiliation(s)
- Yue Huang
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Zhiwei Rong
  - Department of Biostatistics, School of Public Health, Peking University, Beijing, China
- Liuchao Zhang
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Zhenyi Xu
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Jianxin Ji
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Jia He
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Weisha Liu
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Yan Hou
  - Department of Biostatistics, School of Public Health, Peking University, Beijing, China
  - Correspondence: Kang Li; Yan Hou
- Kang Li
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
  - Correspondence: Kang Li; Yan Hou

31
Prelaj A, Galli EG, Miskovic V, Pesenti M, Viscardi G, Pedica B, Mazzeo L, Bottiglieri A, Provenzano L, Spagnoletti A, Marinacci R, De Toma A, Proto C, Ferrara R, Brambilla M, Occhipinti M, Manglaviti S, Galli G, Signorelli D, Giani C, Beninato T, Pircher CC, Rametta A, Kosta S, Zanitti M, Di Mauro MR, Rinaldi A, Di Gregorio S, Antonia M, Garassino MC, de Braud FGM, Restelli M, Lo Russo G, Ganzinelli M, Trovò F, Pedrocchi ALG. Real-world data to build explainable trustworthy artificial intelligence models for prediction of immunotherapy efficacy in NSCLC patients. Front Oncol 2023; 12:1078822. [PMID: 36755856 PMCID: PMC9899835 DOI: 10.3389/fonc.2022.1078822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Accepted: 12/14/2022] [Indexed: 01/24/2023] Open
Abstract
Introduction: Artificial Intelligence (AI) methods are being increasingly investigated as a means to generate predictive models applicable in the clinical practice. In this study, we developed a model to predict the efficacy of immunotherapy (IO) in patients with advanced non-small cell lung cancer (NSCLC) using eXplainable AI (XAI) Machine Learning (ML) methods. Methods: We prospectively collected real-world data from patients with an advanced NSCLC condition receiving immune-checkpoint inhibitors (ICIs) either as a single agent or in combination with chemotherapy. With regard to six different outcomes - Disease Control Rate (DCR), Objective Response Rate (ORR), 6 and 24-month Overall Survival (OS6 and OS24), 3-month Progression-Free Survival (PFS3) and Time to Treatment Failure (TTF3) - we evaluated five different classification ML models: CatBoost (CB), Logistic Regression (LR), Neural Network (NN), Random Forest (RF) and Support Vector Machine (SVM). We used the Shapley Additive Explanation (SHAP) values to explain model predictions. Results: Of 480 patients included in the study, 407 received immunotherapy and 73 chemo- and immunotherapy. Of all the ML models, CB performed the best for OS6 and TTF3 (accuracy 0.83 and 0.81, respectively). CB and LR reached accuracy of 0.75 and 0.73 for the outcome DCR. SHAP for CB demonstrated that the feature that strongly influenced the model's predictions for all three outcomes was Neutrophil to Lymphocyte Ratio (NLR). Performance Status (ECOG-PS) was an important feature for the outcomes OS6 and TTF3, while PD-L1, Line of IO and chemo-immunotherapy appeared to be more important in predicting DCR. Conclusions: In this study we developed an ML algorithm based on real-world data, explained by SHAP techniques, and able to accurately predict the efficacy of immunotherapy in sets of NSCLC patients.
Collapse
Affiliation(s)
- Arsela Prelaj
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy,Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy,*Correspondence: Arsela Prelaj,
| | - Edoardo Gregorio Galli
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy,Niguarda Cancer Center, Grande Ospedale Metropolitano Niguarda, Milan, Italy,Oncology Department, University of Milan, Milan, Italy
| | - Vanja Miskovic
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
| | - Mattia Pesenti
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
| | - Giuseppe Viscardi
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy,Medical Oncology Unit, Department of Precision Medicine, University of Campania “Luigi Vanvitelli”, Naples, Italy
| | - Benedetta Pedica
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
| | - Laura Mazzeo
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy,Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy,Oncology Department, University of Milan, Milan, Italy
| | - Achille Bottiglieri
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy,Oncology Department, University of Milan, Milan, Italy
| | - Leonardo Provenzano
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy,Oncology Department, University of Milan, Milan, Italy
| | - Andrea Spagnoletti
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy,Oncology Department, University of Milan, Milan, Italy
| | - Roberto Marinacci
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
| | - Alessandro De Toma
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
| | - Claudia Proto
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
| | - Roberto Ferrara
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
| | - Marta Brambilla
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
| | - Mario Occhipinti
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
| | - Sara Manglaviti
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
| | - Giulia Galli
- Medical Oncology Unit, Policlinico San Matteo Fondazione IRCCS, Pavia, Italy
| | - Diego Signorelli
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy,Niguarda Cancer Center, Grande Ospedale Metropolitano Niguarda, Milan, Italy
| | - Claudia Giani
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy,Oncology Department, University of Milan, Milan, Italy
| | - Teresa Beninato
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy,Oncology Department, University of Milan, Milan, Italy
| | - Chiara Carlotta Pircher
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy,Oncology Department, University of Milan, Milan, Italy
| | - Alessandro Rametta
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy,Oncology Department, University of Milan, Milan, Italy
| | - Sokol Kosta
- Department of Electronic System, Aalborg University, Copenhagen, Aalborg, Denmark
| | - Michele Zanitti
- Department of Electronic System, Aalborg University, Copenhagen, Aalborg, Denmark
| | - Maria Rosa Di Mauro
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
| | - Arturo Rinaldi
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
| | - Settimio Di Gregorio
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
| | - Martinetti Antonia
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
| | - Marina Chiara Garassino
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy,Thoracic Oncology Program, Section of Hematology/Oncology, University of Chicago, Chicago, IL, United States
| | - Filippo G. M. de Braud
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy,Oncology Department, University of Milan, Milan, Italy
| | - Marcello Restelli
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
| | - Giuseppe Lo Russo
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
| | - Monica Ganzinelli
- Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
| | - Francesco Trovò
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
| | | |
|
32
|
Assessing Metabolic Markers in Glioblastoma Using Machine Learning: A Systematic Review. Metabolites 2023; 13:metabo13020161. [PMID: 36837779 PMCID: PMC9958885 DOI: 10.3390/metabo13020161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Revised: 01/14/2023] [Accepted: 01/18/2023] [Indexed: 01/24/2023] Open
Abstract
Glioblastoma (GBM) is a common and deadly brain tumor with late diagnoses and poor prognoses. Machine learning (ML) is an emerging tool that can create highly accurate diagnostic and prognostic prediction models. This paper aimed to systematically search the literature on ML for GBM metabolism and assess recent advancements. A literature search was performed using predetermined search terms. Articles describing the use of an ML algorithm for GBM metabolism were included. Ten studies met the inclusion criteria for analysis: diagnostic (n = 3, 30%), prognostic (n = 6, 60%), or both (n = 1, 10%). Most studies analyzed data from multiple databases, while 50% (n = 5) included additional original samples. At least 2536 data samples were run through an ML algorithm. Twenty-seven ML algorithms were recorded with a mean of 2.8 algorithms per study. Algorithms were supervised (n = 24, 89%), unsupervised (n = 3, 11%), continuous (n = 19, 70%), or categorical (n = 8, 30%). The mean reported accuracy and AUC of ROC were 95.63% and 0.779, respectively. One hundred six metabolic markers were identified, but only EMP3 was reported in multiple studies. Many studies have identified potential biomarkers for GBM diagnosis and prognostication. These algorithms show promise; however, a consensus on even a handful of biomarkers has not yet been reached.
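The two summary metrics reported above measure different things: accuracy depends on a decision threshold, whereas the area under the ROC curve (AUC) depends only on how well the scores rank positives above negatives. A minimal sketch follows; the labels, scores, threshold, and function names are illustrative assumptions, not data or code from the reviewed studies.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical, class-imbalanced toy sample: 8 negatives and 2 positives.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.2, 0.3, 0.1, 0.45, 0.35, 0.9]

print("accuracy:", accuracy(labels, scores))
print("AUC:", roc_auc(labels, scores))
```

On a toy sample like this the thresholded accuracy and the rank-based AUC need not agree, which is one reason the two figures are usually reported side by side.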
|
33
|
Huang J, Zhao C, Zhang X, Zhao Q, Zhang Y, Chen L, Dai G. Hepatitis B virus pathogenesis relevant immunosignals uncovering amino acids utilization related risk factors guide artificial intelligence-based precision medicine. Front Pharmacol 2022; 13:1079566. [PMID: 36569318 PMCID: PMC9780394 DOI: 10.3389/fphar.2022.1079566] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/25/2022] [Accepted: 11/30/2022] [Indexed: 12/14/2022] Open
Abstract
Background: Although immune microenvironment-related chemokines, extracellular matrix (ECM), and intrahepatic immune cells are reported to be highly involved in hepatitis B virus (HBV)-related diseases, their roles in diagnosis, prognosis, and drug sensitivity evaluation remain unclear. Here, we aimed to study their clinical use to provide a basis for precision medicine in hepatocellular carcinoma (HCC) through the integration of artificial intelligence. Methods: High-throughput liver transcriptomes from Gene Expression Omnibus (GEO), NODE (https://www.bio.sino.org/node), the Cancer Genome Atlas (TCGA), and our in-house hepatocellular carcinoma patients were collected in this study. Core immunosignals that participated in the entire disease course of hepatitis B were explored using the "Gene set variation analysis" R package. Using ROC curve analysis, the impact of core immunosignals and amino acid utilization-related genes on the clinical outcomes of hepatocellular carcinoma patients was calculated. The utility of core immunosignals as a classifier for hepatocellular carcinoma tumor tissue was evaluated using explainable machine-learning methods. A novel deep residual neural network model based on immunosignals was constructed for the long-term overall survival (LS) analysis. In vivo drug sensitivity was calculated by the "oncoPredict" R package. Results: We identified nine genes comprising chemokines and ECM related to hepatitis B virus-induced inflammation and fibrosis as CLST signals. Moreover, CLST was co-enriched with activated CD4+ T cells bearing harmful factors (aCD4) during all stages of hepatitis B virus pathogenesis, which was also verified by our hepatocellular carcinoma data. Unexpectedly, we found that hepatitis B virus-hepatocellular carcinoma patients in the CLST-high/aCD4-high subgroup had the shortest overall survival (OS) and were characterized by a risk gene signature associated with amino acid utilization. Importantly, characteristic genes specific to CLST/aCD4 showed promising clinical relevance in identifying patients with early-stage hepatocellular carcinoma via explainable machine learning. In addition, the 5-year long-term overall survival of hepatocellular carcinoma patients can be effectively classified by a CLST/aCD4-based GeneSet-ResNet model. Subgroups defined by CLST and aCD4 were significantly involved in the sensitivity of hepatitis B virus-hepatocellular carcinoma patients to chemotherapy treatments. Conclusion: CLST and aCD4 are hepatitis B virus pathogenesis-relevant immunosignals that are highly involved in hepatitis B virus-induced inflammation, fibrosis, and hepatocellular carcinoma. Immunogenomic signatures derived from gene set variation analysis enabled efficient diagnostic and prognostic model construction. The clinical application of CLST and aCD4 as indicators would be beneficial for the precision management of hepatocellular carcinoma.
Affiliation(s)
- Jun Huang
- School of Life Sciences, Zhengzhou University, Zhengzhou, Henan, China
| | - Chunbei Zhao
- School of Life Sciences, Zhengzhou University, Zhengzhou, Henan, China
| | - Xinhe Zhang
- School of Life Sciences, Zhengzhou University, Zhengzhou, Henan, China
| | - Qiaohui Zhao
- School of Life Sciences, Zhengzhou University, Zhengzhou, Henan, China
| | - Yanting Zhang
- School of Life Sciences, Zhengzhou University, Zhengzhou, Henan, China
| | - Liping Chen
- Key Laboratory of Gastroenterology and Hepatology, State Key Laboratory for Oncogenes and Related Genes, Department of Gastroenterology and Hepatology, Ministry of Health, Shanghai Institute of Digestive Disease, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China
- Shanghai Public Health Clinical Center, Fudan University, Shanghai, China
| | - Guifu Dai
- School of Life Sciences, Zhengzhou University, Zhengzhou, Henan, China
| |
|
34
|
Ding W, Abdel-Basset M, Hawash H, Ali AM. Explainability of artificial intelligence methods, applications and challenges: A comprehensive survey. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.10.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
|
35
|
Liang B, Gong H, Lu L, Xu J. Risk stratification and pathway analysis based on graph neural network and interpretable algorithm. BMC Bioinformatics 2022; 23:394. [PMID: 36167504 PMCID: PMC9516820 DOI: 10.1186/s12859-022-04950-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/23/2022] [Accepted: 09/19/2022] [Indexed: 12/01/2022] Open
Abstract
Background: Pathway-based analysis of transcriptomic data has shown greater stability and better performance than traditional gene-based analysis. Some pathway-based deep learning models have been developed for bioinformatic analysis, but these models have not fully considered the topological features of pathways, which limits the performance of the final prediction result. Results: To address this issue, we propose a novel model, called PathGNN, which constructs a graph neural network (GNN) model that can capture topological features of pathways. As a case study, PathGNN was applied to predict long-term survival of four types of cancer and achieved promising predictive performance when compared to other common methods. Furthermore, the adoption of an interpretation algorithm enabled the identification of plausible pathways associated with survival. Conclusion: PathGNN demonstrates that GNNs can be effectively applied to build a pathway-based model, resulting in promising predictive power. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04950-1.
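The idea of letting a model use pathway topology can be illustrated with a single neighborhood-averaging step, the building block of many graph neural networks. This is a generic sketch under stated assumptions, not the PathGNN architecture: the three-gene chain, the expression values, and the function names are hypothetical, and learned weights and nonlinearities are omitted.

```python
def normalize_adjacency(adj):
    """Add self-loops, then row-normalize so each node averages over itself and its neighbors."""
    n = len(adj)
    with_loops = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return [[v / sum(row) for v in row] for row in with_loops]


def propagate(adj_norm, features):
    """One message-passing step: each node's new features are a weighted mean of its neighborhood."""
    n = len(adj_norm)
    d = len(features[0])
    return [
        [sum(adj_norm[i][k] * features[k][f] for k in range(n)) for f in range(d)]
        for i in range(n)
    ]


# Hypothetical pathway with three genes connected in a chain: gene0 - gene1 - gene2.
adj = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
# Hypothetical expression values, one feature per gene.
expr = [[1.0], [4.0], [7.0]]

hidden = propagate(normalize_adjacency(adj), expr)

# Mean readout: collapse per-gene features into one pathway-level embedding.
pathway_embedding = [sum(col) / len(col) for col in zip(*hidden)]
print(hidden, pathway_embedding)
```

Because the averaging follows the edges of the pathway graph, two pathways with the same genes but different wiring would yield different node features, which is the kind of topological information a purely gene-set-based model cannot see.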
Affiliation(s)
- Bilin Liang
- Shanghai Artificial Intelligence Laboratory, Yunjing Road 701, Shanghai, China
| | - Haifan Gong
- Shanghai Artificial Intelligence Laboratory, Yunjing Road 701, Shanghai, China
| | - Lu Lu
- Shanghai Artificial Intelligence Laboratory, Yunjing Road 701, Shanghai, China
| | - Jie Xu
- Shanghai Artificial Intelligence Laboratory, Yunjing Road 701, Shanghai, China
| |
|
36
|
A Deep Neural Network for Gastric Cancer Prognosis Prediction Based on Biological Information Pathways. J Oncol 2022; 2022:2965166. [PMID: 36117847 PMCID: PMC9481367 DOI: 10.1155/2022/2965166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 07/09/2022] [Accepted: 07/22/2022] [Indexed: 11/18/2022]
Abstract
Background: Gastric cancer (GC) is one of the deadliest cancers in the world, with a 5-year overall survival rate below 20% for patients with advanced GC. Genomic information is now frequently employed for precision cancer treatment due to the rapid advancements of high-throughput sequencing technologies. As a result, integrating multiomics data to construct predictive models for GC patient prognosis is critical for tailored medical care. Results: In this study, we integrated multiomics data to design a biological pathway-based gastric cancer sparse deep neural network (GCS-Net) by modifying the P-NET model for long-term survival prediction of GC. The GCS-Net showed higher accuracy (accuracy = 0.844), area under the curve (AUC = 0.807), and F1 score (F1 = 0.913) than traditional machine learning models. Furthermore, the GCS-Net not only enables accurate patient survival prognosis but also provides the model interpretability lacking in most traditional deep neural networks to describe the complex biological process of prognosis. The GCS-Net suggested the importance of genes (UBE2C, JAK2, RAD21, CEP250, NUP210, PTPN1, CDC27, NINL, NUP188, and PLK4) and biological pathways (Mitotic Anaphase, Resolution of Sister Chromatid Cohesion, and SUMO E3 ligases) to GC, which is consistent with the results revealed in biological- and medical-related studies of GC. Conclusion: The GCS-Net is an interpretable deep neural network built using biological pathway information whose structure represents a nonlinear hierarchical representation of genes and biological pathways. It can not only accurately predict the prognosis of GC patients but also suggest the importance of genes and biological pathways. The GCS-Net opens up new avenues for biological research and could be adapted for other cancer prediction and discovery activities as well.
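A pathway-based sparse layer of the kind described above can be sketched by masking a dense weight matrix so that each pathway node only receives input from its member genes. The gene and pathway names, the weights, and the tanh activation below are illustrative assumptions; this is not the GCS-Net or P-NET implementation.

```python
import math


def build_mask(genes, pathways):
    """Return mask[p][g] = 1.0 if gene g belongs to pathway p, else 0.0."""
    return [
        [1.0 if g in members else 0.0 for g in genes]
        for members in pathways.values()
    ]


def masked_layer(x, weights, mask, bias):
    """Compute one activation per pathway; masked-out weights contribute nothing."""
    out = []
    for w_row, m_row, b in zip(weights, mask, bias):
        z = sum(w * m * xi for w, m, xi in zip(w_row, m_row, x)) + b
        out.append(math.tanh(z))
    return out


# Hypothetical gene list and pathway membership (placeholders, not real annotations).
genes = ["gene_1", "gene_2", "gene_3", "gene_4"]
pathways = {
    "pathway_A": {"gene_1", "gene_3"},
    "pathway_B": {"gene_2", "gene_4"},
}
mask = build_mask(genes, pathways)

# Hypothetical dense weights; only entries allowed by the mask take effect.
weights = [
    [0.5, 0.9, -0.2, 0.7],
    [0.3, 0.4, 0.8, -0.6],
]
bias = [0.0, 0.0]
x = [1.0, 2.0, 3.0, 4.0]  # hypothetical per-gene input features

activations = masked_layer(x, weights, mask, bias)
print(dict(zip(pathways, activations)))
```

Since every hidden unit corresponds to a named pathway and only sees its member genes, an importance score attached to a unit can be read directly as a pathway-level score, which is the source of the interpretability such architectures claim.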
|
37
|
Park C, Kim B, Park T. DeepHisCoM: deep learning pathway analysis using hierarchical structural component models. Brief Bioinform 2022; 23:6590446. [DOI: 10.1093/bib/bbac171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2021] [Revised: 04/04/2022] [Accepted: 04/18/2022] [Indexed: 11/13/2022] Open
Abstract
Many statistical methods for pathway analysis have been used to identify pathways associated with disease along with biological factors such as genes and proteins. However, most pathway analysis methods neglect the complex nonlinear relationship between biological factors and pathways. In this study, we propose Deep-learning pathway analysis using Hierarchical structured CoMponent models (DeepHisCoM), which uses deep learning to model the nonlinear, complex contribution of biological factors to pathways by constructing a multilayered model that accounts for hierarchical biological structure. Through simulation studies, DeepHisCoM was shown to have higher power for nonlinear pathway effects and comparable power for linear pathway effects when compared to conventional pathway methods. Application to hepatocellular carcinoma (HCC) omics datasets, including metabolomic, transcriptomic and metagenomic datasets, demonstrated that DeepHisCoM successfully identified three well-known pathways that are highly associated with HCC: lysine degradation; valine, leucine and isoleucine biosynthesis; and phenylalanine, tyrosine and tryptophan. Application to the coronavirus disease-2019 (COVID-19) single-nucleotide polymorphism (SNP) dataset also showed that DeepHisCoM identified four pathways that are highly associated with the severity of COVID-19: the mitogen-activated protein kinase (MAPK) signaling pathway, the gonadotropin-releasing hormone (GnRH) signaling pathway, hypertrophic cardiomyopathy and dilated cardiomyopathy. Codes are available at https://github.com/chanwoo-park-official/DeepHisCoM.
Affiliation(s)
- Chanwoo Park
- Department of Statistics, Seoul National University, Seoul 08826, Korea
| | - Boram Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
| | - Taesung Park
- Department of Statistics, Seoul National University, Seoul 08826, Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
| |
|
38
|
Li W, Sun T, Li M, He Y, Li L, Wang L, Wang H, Li J, Wen H, Liu Y, Chen Y, Fan Y, Xin B, Zhang J. GNIFdb: a neoantigen intrinsic feature database for glioma. Database (Oxford) 2022; 2022:6527499. [PMID: 35150127 PMCID: PMC9216533 DOI: 10.1093/database/baac004] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/08/2021] [Revised: 01/06/2022] [Accepted: 01/29/2022] [Indexed: 12/24/2022]
Abstract
Neoantigens are mutation-containing immunogenic peptides from tumor cells. Neoantigen intrinsic features are sequence-associated features of neoantigens, characterized by different amino acid descriptors and physical-chemical properties, which play a crucial role in prioritizing neoantigens with immunogenic potential and in predicting patients with better survival. Different intrinsic features may contribute to varying degrees to evaluating the immunogenic potential of neoantigens. Identification and comparison of intrinsic features among neoantigens are particularly important for developing neoantigen-based personalized immunotherapy. However, there is still no public repository to host the intrinsic features of neoantigens. Therefore, we developed GNIFdb, a glioma neoantigen intrinsic feature database specifically designed for hosting, exploring and visualizing neoantigens and their intrinsic features. The database provides a comprehensive repository of computationally predicted human leukocyte antigen class I (HLA-I) restricted neoantigens and their intrinsic features; a systematic annotation of neoantigens including sequence, neoantigen-associated mutation, gene expression, glioma prognosis, HLA-I subtype and binding affinity between neoantigens and HLA-I; and a genome browser to visualize them in an interactive manner. It represents a valuable resource for the neoantigen research community and is publicly available at http://www.oncoimmunobank.cn/index.php.
Affiliation(s)
- Wendong Li
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Ting Sun
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Muyang Li
- Department of Plant Genetics and Breeding, State Key Laboratory of Plant Physiology and Biochemistry & National Maize Improvement Center, China Agricultural University, No.17 Qinghua East Road, Haidian District, Beijing 100193, P. R. China
| | - Yufei He
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Lin Li
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Lu Wang
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Haoyu Wang
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Jing Li
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Hao Wen
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Yong Liu
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Yifan Chen
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Yubo Fan
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Beibei Xin
- Department of Plant Genetics and Breeding, State Key Laboratory of Plant Physiology and Biochemistry & National Maize Improvement Center, China Agricultural University, No.17 Qinghua East Road, Haidian District, Beijing 100193, P. R. China
| | - Jing Zhang
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| |
|
39
|
Kang M, Ko E, Mersha TB. A roadmap for multi-omics data integration using deep learning. Brief Bioinform 2022; 23:bbab454. [PMID: 34791014 PMCID: PMC8769688 DOI: 10.1093/bib/bbab454] [Citation(s) in RCA: 138] [Impact Index Per Article: 46.0] [Received: 07/27/2021] [Revised: 09/30/2021] [Accepted: 10/05/2021] [Indexed: 12/18/2022] Open
Abstract
High-throughput next-generation sequencing now makes it possible to generate a vast amount of multi-omics data for various applications. These data have revolutionized biomedical research by providing a more comprehensive understanding of the biological systems and molecular mechanisms of disease development. Recently, deep learning (DL) algorithms have become one of the most promising methods in multi-omics data analysis, due to their predictive performance and capability of capturing nonlinear and hierarchical features. While integrating and translating multi-omics data into useful functional insights remains the biggest bottleneck, there is a clear trend towards incorporating multi-omics analysis in biomedical research to help explain the complex relationships between molecular layers. Multi-omics data can help to improve prevention, early detection and prediction; monitor progression; interpret patterns and endotyping; and design personalized treatments. In this review, we outline a roadmap of multi-omics integration using DL and offer a practical perspective on the advantages, challenges and barriers to the implementation of DL in multi-omics data.
Affiliation(s)
- Mingon Kang
- Department of Computer Science at the University of Nevada, Las Vegas, NV, USA
| | - Euiseong Ko
- Department of Computer Science at the University of Nevada, Las Vegas, NV, USA
| | - Tesfaye B Mersha
- Department of Pediatrics, Cincinnati Children’s Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
| |
|
40
|
Gundogdu P, Loucera C, Alamo-Alvarez I, Dopazo J, Nepomuceno I. Integrating pathway knowledge with deep neural networks to reduce the dimensionality in single-cell RNA-seq data. BioData Min 2022; 15:1. [PMID: 34980200 PMCID: PMC8722116 DOI: 10.1186/s13040-021-00285-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/30/2021] [Accepted: 12/04/2021] [Indexed: 11/13/2022] Open
Abstract
Background: Single-cell RNA sequencing (scRNA-seq) data provide valuable insights into cellular heterogeneity, which is significantly improving current knowledge of biology and human disease. One of the main applications of scRNA-seq data analysis is the identification of new cell types and cell states. Deep neural networks (DNNs) are among the best methods to address this problem. However, this performance comes at the cost of a lack of interpretability in the results. In this work we propose an intelligible pathway-driven neural network to correctly solve cell-type related problems at single-cell resolution while providing a biologically meaningful representation of the data. Results: In this study, we explored deep neural networks constrained by several types of prior biological information, e.g. signaling pathway information, as a way to reduce the dimensionality of the scRNA-seq data. We tested the proposed biologically-based architectures on thousands of cells of human and mouse origin across a collection of public datasets in order to check the performance of the model. Specifically, we tested the architecture across different validation scenarios that try to mimic how unknown cell types are clustered by the DNN and how it correctly annotates cell types by querying a database in a retrieval problem. Moreover, our approach proved comparable to other, less interpretable DNN approaches constrained by protein-protein interaction or gene regulation data. Finally, we show how the latent structure learned by the network could be used to visualize and to interpret the composition of human single-cell datasets. Conclusions: Here we demonstrate how the integration of pathways, which convey fundamental information on functional relationships between genes, with DNNs, which provide an excellent classification framework, results in an excellent alternative for learning a biologically meaningful representation of scRNA-seq data. In addition, the introduction of prior biological knowledge in the DNN reduces the size of the network architecture. Comparative results demonstrate a superior performance of this approach with respect to other similar approaches. As an additional advantage, the use of pathways within the DNN structure enables easy interpretability of the results by connecting features to cell functionalities by means of the pathway nodes, as demonstrated with an example with human melanoma tumor cells. Supplementary Information: The online version contains supplementary material available at 10.1186/s13040-021-00285-4.
Affiliation(s)
- Pelin Gundogdu
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013, Sevilla, Spain
| | - Carlos Loucera
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013, Sevilla, Spain
- Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, 41013, Sevilla, Spain
| | - Inmaculada Alamo-Alvarez
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013, Sevilla, Spain
- Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, 41013, Sevilla, Spain
| | - Joaquin Dopazo
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013, Sevilla, Spain
- Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, 41013, Sevilla, Spain
- Bioinformatics in Rare Diseases (BiER), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), FPS, Hospital Virgen del Rocío, 41013, Sevilla, Spain
- FPS/ELIXIR-es, Hospital Virgen del Rocío, 42013, Sevilla, Spain
| | - Isabel Nepomuceno
- Department of Computer Languages and Systems, Universidad de Sevilla, Sevilla, Spain
| |
|
41
|
Yang G, Ye Q, Xia J. Unbox the black-box for the medical explainable AI via multi-modal and multi-centre data fusion: A mini-review, two showcases and beyond. Inf Fusion 2022; 77:29-52. [PMID: 34980946 PMCID: PMC8459787 DOI: 10.1016/j.inffus.2021.07.016] [Citation(s) in RCA: 181] [Impact Index Per Article: 60.3] [Received: 01/27/2021] [Revised: 05/25/2021] [Accepted: 07/25/2021] [Indexed: 05/04/2023]
Abstract
Explainable Artificial Intelligence (XAI) is an emerging research topic of machine learning aimed at unboxing how AI systems' black-box choices are made. This research field inspects the measures and models involved in decision-making and seeks solutions to explain them explicitly. Many machine learning algorithms cannot reveal how and why a decision was made. This is particularly true of the most popular deep neural network approaches currently in use. Consequently, our confidence in AI systems can be hindered by the lack of explainability in these black-box models. XAI is becoming increasingly crucial for deep learning powered applications, especially in medical and healthcare studies, even though these deep neural networks can generally deliver striking gains in performance. The insufficient explainability and transparency of most existing AI systems may be one of the major reasons that successful implementation and integration of AI tools into routine clinical practice are uncommon. In this study, we first surveyed the current progress of XAI and in particular its advances in healthcare applications. We then introduced our solutions for XAI leveraging multi-modal and multi-centre data fusion, and subsequently validated them in two showcases following real clinical scenarios. Comprehensive quantitative and qualitative analyses demonstrate the efficacy of our proposed XAI solutions, from which we can envisage successful applications in a broader range of clinical questions.
Affiliation(s)
- Guang Yang
- National Heart and Lung Institute, Imperial College London, London, UK
- Royal Brompton Hospital, London, UK
- Imperial Institute of Advanced Technology, Hangzhou, China
| | - Qinghao Ye
- Hangzhou Ocean’s Smart Boya Co., Ltd, China
- University of California, San Diego, La Jolla, CA, USA
| | - Jun Xia
- Radiology Department, Shenzhen Second People’s Hospital, Shenzhen, China
| |
|
42
|
Scherer P, Trębacz M, Simidjievski N, Viñas R, Shams Z, Terre HA, Jamnik M, Liò P. Unsupervised construction of computational graphs for gene expression data with explicit structural inductive biases. Bioinformatics 2021; 38:1320-1327. [PMID: 34888618 PMCID: PMC8826027 DOI: 10.1093/bioinformatics/btab830] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2021] [Revised: 09/29/2021] [Accepted: 12/03/2021] [Indexed: 01/05/2023] Open
Abstract
Motivation: Gene expression data are commonly used at the intersection of cancer research and machine learning for better understanding of the molecular status of tumour tissue. Deep learning predictive models have been employed for gene expression data due to their ability to scale and remove the need for manual feature engineering. However, gene expression data are often very high dimensional, noisy and presented with a low number of samples. This poses significant problems for learning algorithms: models often overfit, learn noise and struggle to capture biologically relevant information. In this article, we utilize external biological knowledge embedded within structures of gene interaction graphs such as protein-protein interaction (PPI) networks to guide the construction of predictive models. Results: We present Gene Interaction Network Constrained Construction (GINCCo), an unsupervised method for automated construction of computational graph models for gene expression data that are structurally constrained by prior knowledge of gene interaction networks. We employ this methodology in a case study on incorporating a PPI network in cancer phenotype prediction tasks. Our computational graphs are structurally constructed using topological clustering algorithms on the PPI networks which incorporate inductive biases stemming from network biology research on protein complex discovery. Each node in the GINCCo computational graph represents a biological entity such as a gene, a candidate protein complex or a phenotype instead of an arbitrary hidden node of a neural network. This provides a biologically relevant mechanism for model regularization, yielding strong predictive performance while drastically reducing the number of model parameters and enabling guided post-hoc enrichment analyses of influential gene sets with respect to target phenotypes. Our experiments analysing a variety of cancer phenotypes show that GINCCo often outperforms support vector machines, fully connected multi-layer perceptrons (MLPs) and randomly connected MLPs despite greatly reduced model complexity. Availability and implementation: https://github.com/paulmorio/gincco contains the source code for our approach. We also release a library with algorithms for protein complex discovery within PPI networks at https://github.com/paulmorio/protclus. This repository contains implementations of the clustering algorithms used in this article. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Paul Scherer
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK. To whom correspondence should be addressed.
- Maja Trębacz
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
- Nikola Simidjievski
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
- Ramon Viñas
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
- Zohreh Shams
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
- Helena Andres Terre
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
- Mateja Jamnik
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
- Pietro Liò
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
43
Deep Learning in Cancer Diagnosis and Prognosis Prediction: A Minireview on Challenges, Recent Trends, and Future Directions. Computational and Mathematical Methods in Medicine 2021; 2021:9025470. [PMID: 34754327 PMCID: PMC8572604 DOI: 10.1155/2021/9025470] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 08/20/2021] [Revised: 09/30/2021] [Accepted: 10/05/2021] [Indexed: 12/30/2022]
Abstract
Deep learning (DL) is a branch of machine learning and artificial intelligence that has been applied to many areas in different domains such as health care and drug design. Cancer prognosis estimates the ultimate fate of a cancer subject and provides survival estimation of the subjects. An accurate and timely diagnostic and prognostic decision will greatly benefit cancer subjects. DL has emerged as a technology of choice due to the availability of high computational resources. The main components in a standard computer-aided design (CAD) system are preprocessing, feature recognition, extraction and selection, categorization, and performance assessment. Reduction of costs associated with sequencing systems offers a myriad of opportunities for building precise models for cancer diagnosis and prognosis prediction. In this survey, we provided a summary of current works where DL has helped to determine the best models for the cancer diagnosis and prognosis prediction tasks. DL is a generic model requiring minimal data manipulations and achieves better results while working with enormous volumes of data. Aims are to scrutinize the influence of DL systems using histopathology images, present a summary of state-of-the-art DL methods, and give directions to future researchers to refine the existing methods.
44
Classification and Functional Analysis between Cancer and Normal Tissues Using Explainable Pathway Deep Learning through RNA-Sequencing Gene Expression. Int J Mol Sci 2021; 22:11531. [PMID: 34768960 PMCID: PMC8584109 DOI: 10.3390/ijms222111531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2021] [Revised: 10/21/2021] [Accepted: 10/21/2021] [Indexed: 11/24/2022]
Abstract
Deep learning has proven advantageous in solving cancer diagnostic or classification problems. However, it cannot explain the rationale behind human decisions. Biological pathway databases provide well-studied relationships between genes and their pathways. As pathways comprise knowledge frameworks widely used by human researchers, representing gene-to-pathway relationships in deep learning structures may aid in their comprehension. Here, we propose a deep neural network (PathDeep), which implements gene-to-pathway relationships in its structure. We also provide an application framework measuring the contribution of pathways and genes in deep neural networks in a classification problem. We applied PathDeep to classify cancer and normal tissues based on the publicly available, large gene expression dataset. PathDeep showed higher accuracy than fully connected neural networks in distinguishing cancer from normal tissues (accuracy = 0.994) in 32 tissue samples. We identified 42 pathways related to 32 cancer tissues and 57 associated genes contributing highly to the biological functions of cancer. The most significant pathway was G-protein-coupled receptor signaling, and the most enriched function was the G1/S transition of the mitotic cell cycle, suggesting that these biological functions were the most common cancer characteristics in the 32 tissues.
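The idea of building gene-to-pathway relationships into a network's structure can be illustrated with a masked layer, in which each pathway node only receives input from its member genes. The following is a minimal sketch under stated assumptions, not PathDeep's actual implementation: the gene names, pathway memberships, weights, and the function names `build_mask` and `masked_forward` are all hypothetical and chosen only for illustration.

```python
# Hypothetical gene-to-pathway membership; not taken from any real pathway database.
GENES = ["g1", "g2", "g3", "g4"]
PATHWAYS = {"pathway_A": ["g1", "g2"], "pathway_B": ["g2", "g3", "g4"]}


def build_mask(genes, pathways):
    """Return a pathways x genes 0/1 matrix: 1.0 where the gene belongs to the pathway."""
    return [[1.0 if g in members else 0.0 for g in genes]
            for members in pathways.values()]


def masked_forward(x, weights, mask):
    """Pathway-layer activations: ReLU of the masked weighted sum of gene inputs."""
    out = []
    for w_row, m_row in zip(weights, mask):
        # Weights for genes outside the pathway are zeroed by the mask.
        total = sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        out.append(max(0.0, total))
    return out


mask = build_mask(GENES, PATHWAYS)
weights = [[0.5, -1.0, 2.0, 0.3], [1.0, 1.0, -0.5, 0.25]]  # arbitrary illustrative values
expression = [1.0, 2.0, 3.0, 4.0]  # toy expression vector, one value per gene
activations = masked_forward(expression, weights, mask)
print(activations)
```

Because the mask removes every gene-to-pathway connection that is not in the membership table, each output value can be read as the activity of one named pathway, which is what makes per-pathway contribution analysis possible in this style of model.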
45
Tang B, Chen Y, Wang Y, Nie J. A Wavelet-Based Learning Model Enhances Molecular Prognosis in Pancreatic Adenocarcinoma. BioMed Research International 2021; 2021:7865856. [PMID: 34697591 PMCID: PMC8541860 DOI: 10.1155/2021/7865856] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/16/2021] [Accepted: 09/21/2021] [Indexed: 12/24/2022]
Abstract
Genome-wide omics technology boosts deep interrogation into the clinical prognosis and inherent mechanism of pancreatic oncology. Classic LASSO methods coequally treat all candidates, ignoring individual characteristics, thus frequently deteriorating performance with comparatively more predictors. Here, we propose a wavelet-based deep learning method in variable selection and prognosis formulation for PAAD with small samples and multisource information. With the genomic, epigenomic, and clinical cohort information from The Cancer Genome Atlas, the constructed five-molecule model is validated via Kaplan-Meier survival estimate, rendering significant prognosis capability on high- and low-risk subcohorts (p value < 0.0001), together with three predictors manifesting the individual prognosis significance (p value: 0.0012~0.024). Moreover, the performance of the prognosis model has been benchmarked against the traditional LASSO and wavelet-based methods in the 3- and 5-year prediction AUC items, respectively. Specifically, the proposed model with discrete stationary wavelet base (bior1.5) overwhelmingly outperformed traditional LASSO and wavelet-based methods (AUC: 0.787 vs. 0.782 and 0.721 for the 3-year case; AUC: 0.937 vs. 0.802 and 0.859 for the 5-year case). Thus, the proposed model provides a more accurate perspective, but with less predictor burden for clinical prognosis in the pancreatic carcinoma study.
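The Kaplan-Meier survival estimate used for validation multiplies, at each observed event time, the fraction of at-risk subjects who survive that time. The sketch below shows the estimator on toy data; it is an illustration under assumptions, not the authors' analysis code, and the function name `kaplan_meier` and the sample values are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  -- follow-up time per subject
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t.
        n_at_risk -= removed
    return curve


# Toy cohort: five subjects, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Comparing two such curves, one for the high-risk and one for the low-risk subcohort, is typically done with a log-rank test, which is not shown here.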
Affiliation(s)
- Binhua Tang
- Epigenetics & Function Group, Hohai University, Jiangsu 213022, China
- Yu Chen
- Epigenetics & Function Group, Hohai University, Jiangsu 213022, China
- Yuqi Wang
- Epigenetics & Function Group, Hohai University, Jiangsu 213022, China
- Jiafei Nie
- Epigenetics & Function Group, Hohai University, Jiangsu 213022, China
46
Tran KA, Kondrashova O, Bradley A, Williams ED, Pearson JV, Waddell N. Deep learning in cancer diagnosis, prognosis and treatment selection. Genome Med 2021; 13:152. [PMID: 34579788 PMCID: PMC8477474 DOI: 10.1186/s13073-021-00968-x] [Citation(s) in RCA: 363] [Impact Index Per Article: 90.8] [Received: 12/13/2020] [Accepted: 09/12/2021] [Indexed: 12/13/2022]
Abstract
Deep learning is a subdiscipline of artificial intelligence that uses a machine learning technique called artificial neural networks to extract patterns and make predictions from large data sets. The increasing adoption of deep learning across healthcare domains together with the availability of highly characterised cancer datasets has accelerated research into the utility of deep learning in the analysis of the complex biology of cancer. While early results are promising, this is a rapidly evolving field with new knowledge emerging in both cancer biology and deep learning. In this review, we provide an overview of emerging deep learning techniques and how they are being applied to oncology. We focus on the deep learning applications for omics data types, including genomic, methylation and transcriptomic data, as well as histopathology-based genomic inference, and provide perspectives on how the different data types can be integrated to develop decision support tools. We provide specific examples of how deep learning may be applied in cancer diagnosis, prognosis and treatment management. We also assess the current limitations and challenges for the application of deep learning in precision oncology, including the lack of phenotypically rich data and the need for more explainable deep learning models. Finally, we conclude with a discussion of how current obstacles can be overcome to enable future clinical utilisation of deep learning.
Affiliation(s)
- Khoa A. Tran
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
- School of Biomedical Sciences, Faculty of Health, Queensland University of Technology (QUT), Brisbane, 4059 Australia
- Olga Kondrashova
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
- Andrew Bradley
- Faculty of Engineering, Queensland University of Technology (QUT), Brisbane, 4000 Australia
- Elizabeth D. Williams
- School of Biomedical Sciences, Faculty of Health, Queensland University of Technology (QUT), Brisbane, 4059 Australia
- Australian Prostate Cancer Research Centre - Queensland (APCRC-Q) and Queensland Bladder Cancer Initiative (QBCI), Brisbane, 4102 Australia
- John V. Pearson
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
- Nicola Waddell
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
47
Elmarakeby HA, Hwang J, Arafeh R, Crowdis J, Gang S, Liu D, AlDubayan SH, Salari K, Kregel S, Richter C, Arnoff TE, Park J, Hahn WC, Van Allen EM. Biologically informed deep neural network for prostate cancer discovery. Nature 2021; 598:348-352. [PMID: 34552244 PMCID: PMC8514339 DOI: 10.1038/s41586-021-03922-4] [Citation(s) in RCA: 189] [Impact Index Per Article: 47.3] [Received: 11/06/2020] [Accepted: 08/17/2021] [Indexed: 12/20/2022]
Abstract
The determination of molecular features that mediate clinically aggressive phenotypes in prostate cancer remains a major biological and clinical challenge [1,2]. Recent advances in interpretability of machine learning models as applied to biomedical problems may enable discovery and prediction in clinical cancer genomics [3-5]. Here we developed P-NET, a biologically informed deep learning model, to stratify patients with prostate cancer by treatment-resistance state and evaluate molecular drivers of treatment resistance for therapeutic targeting through complete model interpretability. We demonstrate that P-NET can predict cancer state using molecular data with a performance that is superior to other modelling approaches. Moreover, the biological interpretability within P-NET revealed established and novel molecularly altered candidates, such as MDM4 and FGFR1, which were implicated in predicting advanced disease and validated in vitro. Broadly, biologically informed fully interpretable neural networks enable preclinical discovery and clinical prediction in prostate cancer and may have general applicability across cancer types.
Affiliation(s)
- Haitham A Elmarakeby
- Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA; Al-Azhar University, Cairo, Egypt
- Justin Hwang
- University of Minnesota, Division of Hematology, Oncology and Transplantation, Minneapolis, MN, USA
- Rand Arafeh
- Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jett Crowdis
- Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Sydney Gang
- Dana-Farber Cancer Institute, Boston, MA, USA
- David Liu
- Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Saud H AlDubayan
- Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Keyan Salari
- Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA; Department of Urology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Steven Kregel
- Department of Pathology, University of Illinois at Chicago, Chicago, IL, USA
- Taylor E Arnoff
- Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jihye Park
- Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
- William C Hahn
- Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Eliezer M Van Allen
- Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA
48
Bourgeais V, Zehraoui F, Ben Hamdoune M, Hanczar B. Deep GONet: self-explainable deep neural network based on Gene Ontology for phenotype prediction from gene expression data. BMC Bioinformatics 2021; 22:455. [PMID: 34551707 PMCID: PMC8456586 DOI: 10.1186/s12859-021-04370-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/31/2021] [Accepted: 09/08/2021] [Indexed: 12/15/2022]
Abstract
BACKGROUND: With the rapid advancement of genomic sequencing techniques, massive production of gene expression data is becoming possible, which prompts the development of precision medicine. Deep learning is a promising approach for phenotype prediction (clinical diagnosis, prognosis, and drug response) based on gene expression profiles. Existing deep learning models are usually considered black boxes that provide accurate predictions but are not interpretable. However, accuracy and interpretation are both essential for precision medicine. In addition, most models do not integrate domain knowledge. Hence, making deep learning models interpretable for medical applications using prior biological knowledge is the main focus of this paper. RESULTS: In this paper, we propose a new self-explainable deep learning model, called Deep GONet, integrating the Gene Ontology into the hierarchical architecture of the neural network. This model is based on a fully-connected architecture constrained by the Gene Ontology annotations, such that each neuron represents a biological function. The experiments on cancer diagnosis datasets demonstrate that Deep GONet is both easily interpretable and highly performant in discriminating cancer and non-cancer samples. CONCLUSIONS: Our model provides an explanation for its predictions by identifying the most important neurons and associating them with biological functions, making the model understandable for biologists and physicians.
Affiliation(s)
- Victoria Bourgeais
- IBISC, Univ Evry, Université Paris-Saclay, 91020 Évry-Courcouronnes, France
- Farida Zehraoui
- IBISC, Univ Evry, Université Paris-Saclay, 91020 Évry-Courcouronnes, France
- Blaise Hanczar
- IBISC, Univ Evry, Université Paris-Saclay, 91020 Évry-Courcouronnes, France
49
Levy JJ, Chen Y, Azizgolshani N, Petersen CL, Titus AJ, Moen EL, Vaickus LJ, Salas LA, Christensen BC. MethylSPWNet and MethylCapsNet: Biologically Motivated Organization of DNAm Neural Networks, Inspired by Capsule Networks. NPJ Syst Biol Appl 2021; 7:33. [PMID: 34417465 PMCID: PMC8379254 DOI: 10.1038/s41540-021-00193-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/17/2020] [Accepted: 07/01/2021] [Indexed: 02/07/2023]
Abstract
DNA methylation (DNAm) alterations have been heavily implicated in carcinogenesis and the pathophysiology of diseases through upstream regulation of gene expression. DNAm deep-learning approaches are able to capture features associated with aging, cell type, and disease progression, but lack incorporation of prior biological knowledge. Here, we present modular, user-friendly deep-learning methodology and software, MethylCapsNet and MethylSPWNet, that group CpGs into biologically relevant capsules (such as gene promoter context, CpG island relationship, or user-defined groupings) and relate them to diagnostic and prognostic outcomes. We demonstrate these models' utility on 3,897 individuals in the classification of central nervous system (CNS) tumors. MethylCapsNet and MethylSPWNet provide an opportunity to increase DNAm deep-learning analyses' interpretability by enabling a flexible organization of DNAm data into biologically relevant capsules.
Affiliation(s)
- Joshua J Levy
- Program in Quantitative Biomedical Sciences, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH, USA
- Youdinghuan Chen
- Program in Quantitative Biomedical Sciences, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Nasim Azizgolshani
- Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Curtis L Petersen
- Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, NH, USA
- Alexander J Titus
- Department of Life Sciences, University of New Hampshire, Manchester, NH, USA
- Erika L Moen
- The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, NH, USA
- Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Louis J Vaickus
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH, USA
- Lucas A Salas
- Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Brock C Christensen
- Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Department of Community and Family Medicine, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
50
Sun T, He Y, Li W, Liu G, Li L, Wang L, Xiao Z, Han X, Wen H, Liu Y, Chen Y, Wang H, Li J, Fan Y, Zhang W, Zhang J. neoDL: a novel neoantigen intrinsic feature-based deep learning model identifies IDH wild-type glioblastomas with the longest survival. BMC Bioinformatics 2021; 22:382. [PMID: 34301201 PMCID: PMC8299600 DOI: 10.1186/s12859-021-04301-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/26/2021] [Accepted: 07/07/2021] [Indexed: 12/18/2022]
Abstract
BACKGROUND: Neoantigen-based personalized immune therapies achieve promising results in melanoma and lung cancer, but few neoantigen-based models perform well in IDH wild-type GBM, and the association between neoantigen intrinsic features and prognosis remains unclear in IDH wild-type GBM. We present a novel neoantigen intrinsic feature-based deep learning model (neoDL) to stratify IDH wild-type GBMs into subgroups with different survival. RESULTS: We first derived intrinsic features for each neoantigen associated with survival, then applied neoDL to the TCGA data cohort (AUC = 0.988, p value < 0.0001). Leave-one-out cross-validation (LOOCV) in TCGA demonstrated that neoDL successfully classified IDH wild-type GBMs into different prognostic subgroups, which was further validated in an independent data cohort from an Asian population. Long-term survival IDH wild-type GBMs identified by neoDL were found to be characterized by 12 protective neoantigen intrinsic features and enriched in development and cell cycle. CONCLUSIONS: The model can be therapeutically exploited to identify IDH wild-type GBM patients with good prognosis who will most likely benefit from neoantigen-based personalized immunotherapy. Furthermore, the prognostic intrinsic features of the neoantigens inferred from this study can be used for identifying neoantigens with high potential for immunogenicity. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04301-6.
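Leave-one-out cross-validation holds out each sample in turn, fits on the remaining samples, and scores the prediction for the held-out one. The sketch below shows the loop with a stand-in one-nearest-neighbour classifier on toy one-dimensional data; it is not the neoDL model, and the function names, scores, and labels are hypothetical.

```python
def loocv_accuracy(features, labels, predict):
    """Fraction of samples predicted correctly when each is held out in turn."""
    correct = 0
    for i in range(len(features)):
        # Training set is every sample except index i.
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


def nearest_neighbour(train_x, train_y, x):
    """Stand-in classifier: label of the closest training sample."""
    idx = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[idx]


# Toy data: one numeric score per sample and a survival-group label.
scores = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
groups = ["short", "short", "short", "long", "long", "long"]
accuracy = loocv_accuracy(scores, groups, nearest_neighbour)
print(accuracy)
```

With n samples the model is refit n times, which is affordable for small cohorts but grows costly as the cohort or the model gets larger.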
Affiliation(s)
- Ting Sun
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
- Yufei He
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
- Wendong Li
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
- Guang Liu
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
- Lin Li
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
- Lu Wang
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
- Zixuan Xiao
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
- Xiaohan Han
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
- Hao Wen
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
- Yong Liu
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
- Yifan Chen
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
- Haoyu Wang
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
- Jing Li
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
- Yubo Fan
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China.
- Wei Zhang
- Department of Molecular Neuropathology, Beijing Neurosurgical Institute, Capital Medical University, Beijing, 100070, People's Republic of China; Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, No. 119 South Fourth Ring Road West, Fengtai District, Beijing, 100070, People's Republic of China
- Jing Zhang
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China.