1
Tafavvoghi M, Bongo LA, Shvetsov N, Busund LTR, Møllersen K. Publicly available datasets of breast histopathology H&E whole-slide images: A scoping review. J Pathol Inform 2024; 15:100363. PMID: 38405160; PMCID: PMC10884505; DOI: 10.1016/j.jpi.2024.100363.
Abstract
Advancements in digital pathology and computing resources have made a significant impact in the field of computational pathology for breast cancer diagnosis and treatment. However, access to high-quality labeled histopathological images of breast cancer remains a major challenge that limits the development of accurate and robust deep learning models. In this scoping review, we identified the publicly available datasets of breast H&E-stained whole-slide images (WSIs) that can be used to develop deep learning algorithms. We systematically searched 9 scientific literature databases and 9 research data repositories and found 17 publicly available datasets containing 10 385 H&E WSIs of breast cancer. Moreover, we report the image metadata and characteristics of each dataset to assist researchers in selecting appropriate datasets for specific tasks in breast cancer computational pathology. In addition, we compiled 2 lists of breast H&E patch datasets and private datasets as supplementary resources for researchers. Notably, only 28% of the included articles utilized multiple datasets, and only 14% used an external validation set, suggesting that the performance of the remaining models may be susceptible to overestimation. The TCGA-BRCA dataset was used in 52% of the selected studies; it carries a considerable selection bias that can impact the robustness and generalizability of the trained algorithms. There is also a lack of consistent metadata reporting for breast WSI datasets, which can hinder the development of accurate deep learning models and indicates the need for explicit guidelines for documenting breast WSI dataset characteristics and metadata.
Affiliation(s)
- Masoud Tafavvoghi
- Department of Community Medicine, UiT The Arctic University of Norway, Tromsø, Norway
- Lars Ailo Bongo
- Department of Computer Science, UiT The Arctic University of Norway, Tromsø, Norway
- Nikita Shvetsov
- Department of Computer Science, UiT The Arctic University of Norway, Tromsø, Norway
- Kajsa Møllersen
- Department of Community Medicine, UiT The Arctic University of Norway, Tromsø, Norway
2
Qiu L, Zhao L, Zhao W, Zhao J. Dual-space disentangled-multimodal network (DDM-net) for glioma diagnosis and prognosis with incomplete pathology and genomic data. Phys Med Biol 2024; 69:085028. PMID: 38595094; DOI: 10.1088/1361-6560/ad37ec.
Abstract
Objective. Effective fusion of histology slides and molecular profiles from genomic data has shown great potential in the diagnosis and prognosis of gliomas. However, it remains challenging to explicitly exploit the consistent and complementary information across modalities and to create comprehensive representations of patients. Additionally, existing research mainly focuses on complete multi-modality data and usually fails to construct robust models for incomplete samples. Approach. In this paper, we propose a dual-space disentangled multimodal network (DDM-net) for glioma diagnosis and prognosis. DDM-net disentangles the latent features generated by two separate variational autoencoders (VAEs) into common and specific components through a dual-space disentanglement approach, facilitating the construction of comprehensive patient representations. More importantly, DDM-net imputes the unavailable modality in the latent feature space, making it robust to incomplete samples. Main results. We evaluated our approach on the TCGA-GBMLGG dataset for glioma grading and survival analysis tasks. Experimental results demonstrate that the proposed method achieves superior performance compared to state-of-the-art methods, with a competitive AUC of 0.952 and a C-index of 0.768. Significance. The proposed model may help the clinical understanding of gliomas and can serve as an effective fusion model for multimodal data. Additionally, it can handle incomplete samples, making it less constrained by clinical limitations.
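The C-index of 0.768 quoted above is Harrell's concordance index, the standard ranking metric for survival models. A minimal sketch of how it is computed for right-censored data (an illustration of the metric only, not the DDM-net code):

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable patient pairs whose
    predicted risk ordering matches their observed survival ordering.
    times: observed times; events: 1 = event observed, 0 = censored;
    risk_scores: higher score = predicted higher risk (shorter survival)."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # a pair is comparable only if the earlier time is an observed event
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties count as half-concordant
    return concordant / comparable

# Risks perfectly ordered against survival times -> C-index of 1.0
print(concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.5, 0.1]))  # 1.0
```

Pairs involving a censored patient whose follow-up ends before the other patient's event are skipped, since their true survival ordering is unknown.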
Affiliation(s)
- Lu Qiu
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China
- Lu Zhao
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China
- Wangyuan Zhao
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China
- Jun Zhao
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China
3
Pan L, Peng Y, Li Y, Wang X, Liu W, Xu L, Liang Q, Peng S. SELECTOR: Heterogeneous graph network with convolutional masked autoencoder for multimodal robust prediction of cancer survival. Comput Biol Med 2024; 172:108301. PMID: 38492453; DOI: 10.1016/j.compbiomed.2024.108301.
Abstract
Accurately predicting the survival rate of cancer patients is crucial for aiding clinicians in planning appropriate treatment, reducing cancer-related medical expenses, and significantly enhancing patients' quality of life. Multimodal prediction of cancer patient survival offers a more comprehensive and precise approach. However, existing methods still grapple with challenges related to missing multimodal data and information interaction within modalities. This paper introduces SELECTOR, a heterogeneous graph-aware network based on convolutional mask encoders for robust multimodal prediction of cancer patient survival. SELECTOR comprises feature edge reconstruction, convolutional mask encoder, feature cross-fusion, and multimodal survival prediction modules. Initially, we construct a multimodal heterogeneous graph and employ the meta-path method for feature edge reconstruction, ensuring comprehensive incorporation of feature information from graph edges and effective embedding of nodes. To mitigate the impact of missing features within a modality on prediction accuracy, we devise a convolutional masked autoencoder (CMAE) to process the heterogeneous graph after feature reconstruction. Subsequently, the feature cross-fusion module facilitates communication between modalities, ensuring that output features encompass all features of the modality and relevant information from other modalities. Extensive experiments and analysis on six cancer datasets from TCGA demonstrate that our method significantly outperforms state-of-the-art methods in both modality-missing and intra-modality information-confirmed cases. Our code is available at https://github.com/panliangrui/Selector.
Affiliation(s)
- Liangrui Pan
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410083, Hunan, China.
- Yijun Peng
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410083, Hunan, China.
- Yan Li
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410083, Hunan, China.
- Xiang Wang
- Department of Thoracic Surgery, The Second Xiangya Hospital, Central South University, Changsha, 410011, Hunan, China.
- Wenjuan Liu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410083, Hunan, China.
- Liwen Xu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410083, Hunan, China.
- Qingchun Liang
- Department of Pathology, The Second Xiangya Hospital, Central South University, Changsha, 410011, Hunan, China.
- Shaoliang Peng
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410083, Hunan, China.
4
Vollmer A, Hartmann S, Vollmer M, Shavlokhova V, Brands RC, Kübler A, Wollborn J, Hassel F, Couillard-Despres S, Lang G, Saravi B. Multimodal artificial intelligence-based pathogenomics improves survival prediction in oral squamous cell carcinoma. Sci Rep 2024; 14:5687. PMID: 38453964; PMCID: PMC10920832; DOI: 10.1038/s41598-024-56172-5.
Abstract
In this study, we aimed to develop a novel prognostic algorithm for oral squamous cell carcinoma (OSCC) using a combination of pathogenomics and AI-based techniques. We collected comprehensive clinical, genomic, and pathology data from a cohort of OSCC patients in the TCGA dataset and used machine learning and deep learning algorithms to identify relevant features that are predictive of survival outcomes. Our analyses included 406 OSCC patients. Initial analyses involved gene expression analyses, principal component analyses, gene enrichment analyses, and feature importance analyses. These insights were foundational for subsequent model development. Furthermore, we applied five machine learning/deep learning algorithms (Random Survival Forest, Gradient Boosting Survival Analysis, Cox PH, Fast Survival SVM, and DeepSurv) for survival prediction. Our initial analyses revealed relevant gene expression variations and biological pathways, laying the groundwork for robust feature selection in model building. The results showed that the multimodal model outperformed the unimodal models across all methods, with c-index values of 0.722 for RSF, 0.633 for GBSA, 0.625 for FastSVM, 0.633 for CoxPH, and 0.515 for DeepSurv. When considering only important features, the multimodal model continued to outperform the unimodal models, with c-index values of 0.834 for RSF, 0.747 for GBSA, 0.718 for FastSVM, 0.742 for CoxPH, and 0.635 for DeepSurv. Our results demonstrate the potential of pathogenomics and AI-based techniques in improving the accuracy of prognostic prediction in OSCC, which may ultimately aid in the development of personalized treatment strategies for patients with this devastating disease.
Affiliation(s)
- Andreas Vollmer
- Department of Oral and Maxillofacial Plastic Surgery, University Hospital of Würzburg, 97070, Würzburg, Franconia, Germany.
- Stefan Hartmann
- Department of Oral and Maxillofacial Plastic Surgery, University Hospital of Würzburg, 97070, Würzburg, Franconia, Germany
- Michael Vollmer
- Department of Oral and Maxillofacial Surgery, Tuebingen University Hospital, Osianderstrasse 2-8, 72076, Tuebingen, Germany
- Veronika Shavlokhova
- Maxillofacial Surgery, University Hospital Ruppin-Brandenburg, Fehrbelliner Straße 38, 16816, Neuruppin, Germany
- Roman C Brands
- Department of Oral and Maxillofacial Plastic Surgery, University Hospital of Würzburg, 97070, Würzburg, Franconia, Germany
- Alexander Kübler
- Department of Oral and Maxillofacial Plastic Surgery, University Hospital of Würzburg, 97070, Würzburg, Franconia, Germany
- Jakob Wollborn
- Department of Anesthesiology, Perioperative and Pain Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, USA
- Frank Hassel
- Department of Spine Surgery, Loretto Hospital, Freiburg, Germany
- Sebastien Couillard-Despres
- Institute of Experimental Neuroregeneration, Paracelsus Medical University, 5020, Salzburg, Austria
- Austrian Cluster for Tissue Regeneration, Vienna, Austria
- Gernot Lang
- Department of Orthopedics and Trauma Surgery, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Babak Saravi
- Department of Anesthesiology, Perioperative and Pain Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, USA
- Department of Spine Surgery, Loretto Hospital, Freiburg, Germany
- Institute of Experimental Neuroregeneration, Paracelsus Medical University, 5020, Salzburg, Austria
- Department of Orthopedics and Trauma Surgery, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
5
Parvaiz A, Nasir ES, Fraz MM. From Pixels to Prognosis: A Survey on AI-Driven Cancer Patient Survival Prediction Using Digital Histology Images. J Imaging Inform Med 2024. PMID: 38429563; DOI: 10.1007/s10278-024-01049-2.
Abstract
Survival analysis is an integral part of medical statistics that is extensively utilized to establish prognostic indices for mortality or disease recurrence, assess treatment efficacy, and tailor effective treatment plans. The identification of prognostic biomarkers capable of predicting patient survival is a primary objective in the field of cancer research. With the recent integration of digital histology images into routine clinical practice, a plethora of Artificial Intelligence (AI)-based methods for digital pathology has emerged in the scholarly literature, facilitating patient survival prediction. These methods have demonstrated remarkable proficiency in analyzing and interpreting whole slide images, yielding results comparable to those of expert pathologists. The complexity of AI-driven techniques is magnified by the distinctive characteristics of digital histology images, including their gigapixel size and diverse tissue appearances. Consequently, advanced patch-based methods are employed to effectively extract features that correlate with patient survival. These computational methods significantly enhance survival prediction accuracy and augment prognostic capabilities in cancer patients. The review discusses the methodologies employed in the literature, their performance metrics, ongoing challenges, and potential solutions for future advancements. This paper explains survival analysis and the feature extraction methods used for cancer patient data, and it compiles essential acronyms related to cancer precision medicine. Notably, this is the first review paper in the field. The target audience for this interdisciplinary review comprises AI practitioners, medical statisticians, and progressive oncologists who are enthusiastic about translating AI-driven solutions into clinical practice. We expect this comprehensive review article to guide future research directions in the field of cancer research.
Affiliation(s)
- Arshi Parvaiz
- National University of Sciences and Technology (NUST), Islamabad, Pakistan
- Esha Sadia Nasir
- National University of Sciences and Technology (NUST), Islamabad, Pakistan
6
Al Turkestani N, Li T, Bianchi J, Gurgel M, Prieto J, Shah H, Benavides E, Soki F, Mishina Y, Fontana M, Rao A, Zhu H, Cevidanes L. A comprehensive patient-specific prediction model for temporomandibular joint osteoarthritis progression. Proc Natl Acad Sci U S A 2024; 121:e2306132121. PMID: 38346188; PMCID: PMC10895339; DOI: 10.1073/pnas.2306132121.
Abstract
Temporomandibular joint osteoarthritis (TMJ OA) is a prevalent degenerative disease characterized by chronic pain and impaired jaw function. The complexity of TMJ OA has hindered the development of prognostic tools, posing a significant challenge to timely, patient-specific management. Addressing this gap, our research employs a comprehensive, multidimensional approach to advance TMJ OA prognostication. We conducted a prospective study with 106 subjects, 74 of whom were followed up after 2 to 3 years of conservative treatment. Central to our methodology is the development of an innovative, open-source predictive modeling framework, the Ensemble via Hierarchical Predictions through Nested cross-validation (EHPN) tool. This framework synergistically integrates 18 feature selection, statistical, and machine learning methods to yield an accuracy of 0.87, with an area under the ROC curve of 0.72 and an F1 score of 0.82. Beyond these technical advancements, our study emphasizes the global impact of TMJ OA, recognizing its unique demographic occurrence, and highlights key factors influencing TMJ OA progression. Using SHAP analysis, we identified personalized prognostic predictors: lower values of headache, lower back pain, restless sleep, condylar high gray-level (GL) run emphasis, articular fossa GL nonuniformity, and long-run low-GL emphasis, together with higher values of superior joint space, mouth opening, salivary vascular endothelial growth factor and matrix metalloproteinase-7, serum epithelial neutrophil-activating peptide, and age, indicate a likelihood of recovery. Our multidimensional and multimodal EHPN tool enhances clinicians' decision-making, offering a transformative translational infrastructure. The EHPN model is a significant contribution to precision medicine, offering a paradigm shift in the management of temporomandibular disorders and potentially influencing broader applications in personalized healthcare.
Affiliation(s)
- Najla Al Turkestani
- Department of Restorative Dentistry, Faculty of Dentistry, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Department of Orthodontics and Pediatric Dentistry, School of Dentistry, University of Michigan, Ann Arbor, MI 48109
- Tengfei Li
- Department of Psychiatry, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599
- Jonas Bianchi
- Department of Orthodontics, University of the Pacific, Arthur A. Dugoni School of Dentistry, San Francisco, CA 94103
- Marcela Gurgel
- Department of Orthodontics and Pediatric Dentistry, School of Dentistry, University of Michigan, Ann Arbor, MI 48109
- Juan Prieto
- Department of Psychiatry, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599
- Hina Shah
- Department of Psychiatry, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599
- Erika Benavides
- Department of Periodontics & Oral Medicine, School of Dentistry, University of Michigan, Ann Arbor, MI 48109
- Fabiana Soki
- Department of Periodontics & Oral Medicine, School of Dentistry, University of Michigan, Ann Arbor, MI 48109
- Yuji Mishina
- Department of Biologic and Materials Sciences & Prosthodontics, School of Dentistry, University of Michigan, Ann Arbor, MI 48109
- Margherita Fontana
- Department of Cariology, Restorative Sciences and Endodontics, School of Dentistry, University of Michigan, Ann Arbor, MI 48109
- Arvind Rao
- Department of Radiation Oncology, University of Michigan, Ann Arbor, MI 48109
- Department of Computational Medicine & Bioinformatics, School of Dentistry, University of Michigan, Ann Arbor, MI 48109
- Hongtu Zhu
- Department of Radiology and Biomedical Research Imaging Center, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599
- Lucia Cevidanes
- Department of Orthodontics and Pediatric Dentistry, School of Dentistry, University of Michigan, Ann Arbor, MI 48109
7
Feng X, Shu W, Li M, Li J, Xu J, He M. Pathogenomics for accurate diagnosis, treatment, prognosis of oncology: a cutting edge overview. J Transl Med 2024; 22:131. PMID: 38310237; PMCID: PMC10837897; DOI: 10.1186/s12967-024-04915-3.
Abstract
The capability to gather heterogeneous data, alongside the increasing power of artificial intelligence to examine it, is leading a revolution in harnessing multimodal data in the life sciences. However, most approaches are limited to unimodal data, leaving integrated approaches across modalities relatively underdeveloped in computational pathology. Pathogenomics, an invasive method that integrates advanced molecular diagnostics from genomic data, morphological information from histopathological imaging, and codified clinical data, enables the discovery of new multimodal cancer biomarkers that could propel the field of precision oncology in the coming decade. In this perspective, we offer our opinions on synthesizing complementary modalities of data with emerging multimodal artificial intelligence methods in pathogenomics, including the correlation between the pathological and genomic profiles of cancer and the fusion of histology and genomic profiles. We also present challenges, opportunities, and avenues for future work.
Affiliation(s)
- Xiaobing Feng
- College of Electrical and Information Engineering, Hunan University, Changsha, China
- Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, Zhejiang, China
- Wen Shu
- College of Electrical and Information Engineering, Hunan University, Changsha, China
- Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, Zhejiang, China
- Mingya Li
- College of Electrical and Information Engineering, Hunan University, Changsha, China
- Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, Zhejiang, China
- Junyu Li
- College of Electrical and Information Engineering, Hunan University, Changsha, China
- Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, Zhejiang, China
- Junyao Xu
- Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, Zhejiang, China
- Min He
- College of Electrical and Information Engineering, Hunan University, Changsha, China
- Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310022, Zhejiang, China
8
Cai H, Liao Y, Zhu L, Wang Z, Song J. Improving Cancer Survival Prediction via Graph Convolutional Neural Network Learning on Protein-Protein Interaction Networks. IEEE J Biomed Health Inform 2024; 28:1134-1143. PMID: 37963003; DOI: 10.1109/jbhi.2023.3332640.
Abstract
Cancer is one of the most challenging health problems worldwide. Accurate cancer survival prediction is vital for clinical decision making. Many deep learning methods have been proposed to understand the association between patients' genomic features and survival time. In most cases, the gene expression matrix is fed directly to the deep learning model. However, this approach completely ignores the interactions between biomolecules, and the resulting models can only learn the expression levels of genes to predict patient survival. In essence, the interaction between biomolecules is the key to determining the direction and function of biological processes. Proteins are the building blocks and principal executors of life activities, and as such, their complex interaction network is potentially informative for deep learning methods. Therefore, a more reliable approach is to have the neural network learn from both gene expression data and protein interaction networks. We propose a new computational approach, termed CRESCENT, a graph convolutional network (GCN) that incorporates protein-protein interaction (PPI) prior knowledge to improve cancer survival prediction. CRESCENT relies on gene expression networks rather than gene expression levels to predict patient survival. The performance of CRESCENT is evaluated on a large-scale pan-cancer dataset consisting of 5991 patients from 16 different types of cancers. Extensive benchmarking experiments demonstrate that our proposed method is competitive in terms of the time-dependent concordance index (Ctd) when compared with several existing state-of-the-art approaches. Experiments also show that incorporating the network structure between genomic features effectively improves cancer survival prediction.
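The graph convolution at the core of this approach propagates each gene's features over its PPI neighborhood. A single symmetric-normalized GCN layer can be sketched as follows (a toy illustration with made-up dimensions, not the CRESCENT implementation):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: H' = relu(D^{-1/2} (A+I) D^{-1/2} H W).
    A: (n, n) adjacency of the PPI graph; H: (n, f) node features
    (e.g., per-gene expression); W: (f, f_out) learnable weights."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric degree normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Toy PPI graph: 3 proteins in a chain, 2 input features, 2 output features
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = np.random.rand(3, 2)
W = np.random.rand(2, 2)
print(gcn_layer(A, H, W).shape)  # (3, 2)
```

Each row of the output mixes a protein's own features with those of its interaction partners, which is how network structure (rather than expression levels alone) enters the model.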
9
Mahootiha M, Qadir HA, Aghayan D, Fretland ÅA, von Gohren Edwin B, Balasingham I. Deep learning-assisted survival prognosis in renal cancer: A CT scan-based personalized approach. Heliyon 2024; 10:e24374. PMID: 38298725; PMCID: PMC10828686; DOI: 10.1016/j.heliyon.2024.e24374.
Abstract
This paper presents a deep learning (DL) approach for predicting survival probabilities of renal cancer patients based solely on preoperative CT imaging. The proposed approach consists of two networks: a classifier network and a survival network. The classifier extracts features from 3D CT scans to predict the grade of renal cell carcinoma (RCC) tumors, as defined by the International Society of Urological Pathology (ISUP). Our classifier is a 3D convolutional neural network, which avoids losing crucial information on the interconnection of slices in 3D images. We employ multiple procedures, including image augmentation, preprocessing, and concatenation, to improve the performance of the classifier. Given the strong correlation between ISUP grading and renal cancer prognosis in the clinical context, we use the ISUP grading features extracted by the classifier as the input to the survival network. By leveraging this clinical association and the classifier network, we are able to model our survival analysis using a simple DL-based network. We adopt a discrete LogisticHazard-based loss to extract intrinsic survival characteristics of RCC tumors from CT images. This allows us to build a fully parametric survival model that varies with patients' tumor characteristics and predicts non-proportional survival probability curves for different patients. Our results demonstrate that the proposed method can predict the future course of renal cancer with reasonable accuracy from CT scans. The proposed method obtained an average concordance index of 0.72, an integrated Brier score of 0.15, and an area under the curve value of 0.71 on the test cohorts.
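The discrete LogisticHazard formulation mentioned above has a network predict a hazard (conditional event probability) for each time interval; the patient-specific survival curve then follows by compounding. A minimal sketch under the standard discrete-time definition (not the paper's code):

```python
import math

def survival_curve(logits):
    """Convert per-interval hazard logits from a network head into a
    discrete survival curve: S(t) = prod_{k<=t} (1 - h_k),
    where h_k = sigmoid(logit_k) is the hazard in interval k."""
    hazards = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    surv, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h       # probability of surviving through interval k
        surv.append(s)
    return surv

# A hypothetical patient whose predicted hazard rises over four time bins:
curve = survival_curve([-3.0, -2.0, -1.0, 0.0])
print([round(s, 3) for s in curve])  # monotonically decreasing survival probabilities
```

Because each patient gets their own hazard sequence, the resulting curves need not be proportional across patients, which is the non-proportionality property the abstract highlights.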
Affiliation(s)
- Maryamalsadat Mahootiha
- The Intervention Centre, Oslo University Hospital, Oslo, 0372, Norway
- Faculty of Medicine, University of Oslo, Oslo, 0372, Norway
- Hemin Ali Qadir
- The Intervention Centre, Oslo University Hospital, Oslo, 0372, Norway
- Davit Aghayan
- The Intervention Centre, Oslo University Hospital, Oslo, 0372, Norway
- Bjørn von Gohren Edwin
- The Intervention Centre, Oslo University Hospital, Oslo, 0372, Norway
- Faculty of Medicine, University of Oslo, Oslo, 0372, Norway
- Ilangko Balasingham
- The Intervention Centre, Oslo University Hospital, Oslo, 0372, Norway
- Department of Electronic Systems, Norwegian University of Science and Technology, Trondheim, Norway
10
Karampuri A, Perugu S. A breast cancer-specific combinational QSAR model development using machine learning and deep learning approaches. Front Bioinform 2024; 3:1328262. PMID: 38288043; PMCID: PMC10822965; DOI: 10.3389/fbinf.2023.1328262.
Abstract
Breast cancer is the most prevalent and heterogeneous form of cancer affecting women worldwide. Various therapeutic strategies are in practice based on the extent of disease spread, such as surgery, chemotherapy, radiotherapy, and immunotherapy. Combinational therapy is another strategy that has proven effective in controlling cancer progression: an anchor drug, a well-established primary therapeutic agent with known efficacy for specific targets, is administered together with a library drug, a supplementary drug intended to enhance the anchor drug's efficacy and broaden the therapeutic approach. Our work focused on harnessing regression-based machine learning (ML) and deep learning (DL) algorithms to develop a quantitative structure-activity relationship (QSAR) model linking the molecular descriptors of drug pairs to their combined biological activity. Eleven widely used machine learning and deep learning algorithms were used to develop QSAR models. A total of 52 breast cancer cell lines, 25 anchor drugs, and 51 library drugs were considered in developing the QSAR model. We observed that deep neural networks (DNNs) achieved an impressive R2 (coefficient of determination) of 0.94, with an RMSE (root mean square error) of 0.255, making them the most effective algorithm for developing a structure-activity relationship with strong generalization capability. In conclusion, applying combinational therapy alongside ML and DL techniques represents a promising approach to combating breast cancer.
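For reference, the R2 of 0.94 and RMSE of 0.255 quoted above are standard regression metrics; a generic illustration of how they are computed (not the paper's pipeline):

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error of predictions against observed activities."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.
    1.0 means perfect fit; 0.0 means no better than predicting the mean."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy activities (hypothetical values, for illustration only)
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(round(rmse(y_true, y_pred), 3), round(r2(y_true, y_pred), 3))  # 0.158 0.98
```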
Affiliation(s)
- Shyam Perugu
- Department of Biotechnology, National Institute of Technology, Warangal, India
11
Jiang L, Xu C, Bai Y, Liu A, Gong Y, Wang YP, Deng HW. Autosurv: interpretable deep learning framework for cancer survival analysis incorporating clinical and multi-omics data. NPJ Precis Oncol 2024; 8:4. PMID: 38182734; PMCID: PMC10770412; DOI: 10.1038/s41698-023-00494-6.
Abstract
Accurate prognosis for cancer patients can provide critical information for optimizing treatment plans and improving quality of life. Combining omics data and demographic/clinical information can offer a more comprehensive view of cancer prognosis than using omics or clinical data alone and can also reveal the underlying disease mechanisms at the molecular level. In this study, we developed and validated a deep learning framework to extract information from high-dimensional gene expression and miRNA expression data and conduct prognosis prediction for breast cancer and ovarian cancer patients using multiple independent multi-omics datasets. Our model achieved significantly better prognosis prediction than current machine learning and deep learning approaches in various settings. Moreover, an interpretation method was applied to tackle the "black-box" nature of deep neural networks, and we identified features (i.e., genes, miRNAs, demographic/clinical variables) that were important for distinguishing predicted high- and low-risk patients. The significance of the identified features was partially supported by previous studies.
Affiliation(s)
- Lindong Jiang
- Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Chao Xu
- Department of Biostatistics and Epidemiology, University of Oklahoma Health Sciences Center, Oklahoma City, OK, 73104, USA
- Yuntong Bai
- Department of Biomedical Engineering, School of Science and Engineering, Tulane University, New Orleans, LA, 70118, USA
- Anqi Liu
- Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Yun Gong
- Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Yu-Ping Wang
- Department of Biomedical Engineering, School of Science and Engineering, Tulane University, New Orleans, LA, 70118, USA
- Hong-Wen Deng
- Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, 70112, USA
12
Rajaram S, Mitchell CS. Data Augmentation with Cross-Modal Variational Autoencoders (DACMVA) for Cancer Survival Prediction. Information 2024; 15:7. [PMID: 38665395 PMCID: PMC11044918 DOI: 10.3390/info15010007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/28/2024] Open
Abstract
The ability to translate Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) into different modalities and data types is essential to improve Deep Learning (DL) for predictive medicine. This work presents DACMVA, a novel framework to conduct data augmentation in a cross-modal dataset by translating between modalities and oversampling imputations of missing data. DACMVA was inspired by previous work on the alignment of latent spaces in Autoencoders. DACMVA is a DL data augmentation pipeline that improves the performance in a downstream prediction task. The unique DACMVA framework leverages a cross-modal loss to improve the imputation quality and employs training strategies to enable regularized latent spaces. Oversampling of augmented data is integrated into the prediction training. It is empirically demonstrated that the new DACMVA framework is effective in the often-neglected scenario of DL training on tabular data with continuous labels. Specifically, DACMVA is applied towards cancer survival prediction on tabular gene expression data where there is a portion of missing data in a given modality. DACMVA significantly (p << 0.001, one-sided Wilcoxon signed-rank test) outperformed the non-augmented baseline and competing augmentation methods with varying percentages of missing data (4%, 90%, 95% missing). As such, DACMVA provides significant performance improvements, even in very-low-data regimes, over existing state-of-the-art methods, including TDImpute and oversampling alone.
Collapse
Affiliation(s)
- Sara Rajaram
- Laboratory for Pathology Dynamics, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
- Cassie S. Mitchell
- Laboratory for Pathology Dynamics, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
- Center for Machine Learning at Georgia Tech, Georgia Institute of Technology, Atlanta, GA 30332, USA
13
Theis M, Block W, Luetkens JA, Attenberger UI, Nowak S, Sprinkart AM. Direct deep learning-based survival prediction from pre-interventional CT prior to transcatheter aortic valve replacement. Eur J Radiol 2023; 168:111150. [PMID: 37844428 DOI: 10.1016/j.ejrad.2023.111150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2023] [Revised: 09/27/2023] [Accepted: 10/10/2023] [Indexed: 10/18/2023]
Abstract
PURPOSE To investigate survival prediction in patients undergoing transcatheter aortic valve replacement (TAVR) using deep learning (DL) methods applied directly to pre-interventional CT images and to compare performance with survival models based on scalar markers of body composition. METHOD This retrospective single-center study included 760 patients undergoing TAVR (mean age 81 ± 6 years; 389 female). As a baseline, a Cox proportional hazards model (CPHM) was trained to predict survival on sex, age, and the CT body composition markers fatty muscle fraction (FMF), skeletal muscle radiodensity (SMRD), and skeletal muscle area (SMA) derived from paraspinal muscle segmentation of a single slice at L3/L4 level. The convolutional neural network (CNN) encoder of the DL model for survival prediction was pre-trained in an autoencoder setting with and without a focus on paraspinal muscles. Finally, a combination of DL and CPHM was evaluated. Performance was assessed by C-index and area under the receiver operating characteristic curve (AUC) for 1-year and 2-year survival. All methods were trained with five-fold cross-validation and were evaluated on 152 hold-out test cases. RESULTS The CNN for direct image-based survival prediction, pre-trained in a focused autoencoder scenario, outperformed the baseline CPHM (CPHM: C-index = 0.608, 1Y-AUC = 0.606, 2Y-AUC = 0.594 vs. DL: C-index = 0.645, 1Y-AUC = 0.687, 2Y-AUC = 0.692). Combining DL and CPHM led to further improvement (C-index = 0.668, 1Y-AUC = 0.713, 2Y-AUC = 0.696). CONCLUSIONS Direct DL-based survival prediction shows potential to improve image feature extraction compared to segmentation-based scalar markers of body composition for risk assessment in TAVR patients.
Affiliation(s)
- Maike Theis
- Department of Diagnostic and Interventional Radiology, Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
- Wolfgang Block
- Department of Diagnostic and Interventional Radiology, Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany; Department of Radiotherapy and Radiation Oncology, University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany; Department of Neuroradiology, University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
- Julian A Luetkens
- Department of Diagnostic and Interventional Radiology, Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
- Ulrike I Attenberger
- Department of Diagnostic and Interventional Radiology, Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
- Sebastian Nowak
- Department of Diagnostic and Interventional Radiology, Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
- Alois M Sprinkart
- Department of Diagnostic and Interventional Radiology, Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany
14
Ellen JG, Jacob E, Nikolaou N, Markuzon N. Autoencoder-based multimodal prediction of non-small cell lung cancer survival. Sci Rep 2023; 13:15761. [PMID: 37737469 PMCID: PMC10517020 DOI: 10.1038/s41598-023-42365-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2023] [Accepted: 09/09/2023] [Indexed: 09/23/2023] Open
Abstract
The ability to accurately predict non-small cell lung cancer (NSCLC) patient survival is crucial for informing physician decision-making, and the increasing availability of multi-omics data offers the promise of enhancing prognosis predictions. We present a multimodal integration approach that leverages microRNA, mRNA, DNA methylation, long non-coding RNA (lncRNA) and clinical data to predict NSCLC survival and identify patient subtypes, utilizing denoising autoencoders for data compression and integration. Survival performance for patients with lung adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) was compared across modality combinations and data integration methods. Using The Cancer Genome Atlas data, our results demonstrate that survival prediction models combining multiple modalities outperform single modality models. The highest performance was achieved with a combination of only two modalities, lncRNA and clinical, at concordance indices (C-indices) of 0.69 ± 0.03 for LUAD and 0.62 ± 0.03 for LUSC. Models utilizing all five modalities achieved mean C-indices of 0.67 ± 0.04 and 0.63 ± 0.02 for LUAD and LUSC, respectively, while the best individual modality performance reached C-indices of 0.64 ± 0.03 for LUAD and 0.59 ± 0.03 for LUSC. Analysis of biological differences revealed two distinct survival subtypes with over 900 differentially expressed transcripts.
Affiliation(s)
- Jacob G Ellen
- Institute of Health Informatics, University College London, London, UK
- Etai Jacob
- AstraZeneca, Oncology Data Science, Waltham, MA, USA
15
Isaksson LJ, Summers P, Mastroleo F, Marvaso G, Corrao G, Vincini MG, Zaffaroni M, Ceci F, Petralia G, Orecchia R, Jereczek-Fossa BA. Automatic Segmentation with Deep Learning in Radiotherapy. Cancers (Basel) 2023; 15:4389. [PMID: 37686665 PMCID: PMC10486603 DOI: 10.3390/cancers15174389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2023] [Revised: 08/28/2023] [Accepted: 08/30/2023] [Indexed: 09/10/2023] Open
Abstract
This review provides a formal overview of current automatic segmentation studies that use deep learning in radiotherapy. It covers 807 published papers and includes multiple cancer sites, image types (CT/MRI/PET), and segmentation methods. We collected key statistics about the papers to uncover commonalities, trends, and methods, and to identify areas where more research might be needed. Moreover, we analyzed the corpus by posing explicit questions aimed at providing high-quality and actionable insights, including: "What should researchers think about when starting a segmentation study?", "How can research practices in medical image segmentation be improved?", "What is missing from the current corpus?", and more. This allowed us to provide practical guidelines on how to conduct a good segmentation study in today's competitive environment, which will be useful for future research within the field, regardless of the specific radiotherapeutic subfield. To aid in our analysis, we used the large language model ChatGPT to condense information.
Affiliation(s)
- Lars Johannes Isaksson
- Division of Radiation Oncology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy; (L.J.I.); (F.M.); (G.C.); (M.G.V.); (M.Z.); (B.A.J.-F.)
- Department of Oncology and Hemato-Oncology, University of Milan, 20141 Milan, Italy; (F.C.); (G.P.)
- Paul Summers
- Division of Radiology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy
- Federico Mastroleo
- Division of Radiation Oncology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy; (L.J.I.); (F.M.); (G.C.); (M.G.V.); (M.Z.); (B.A.J.-F.)
- Department of Translational Medicine, University of Piemonte Orientale (UPO), 20188 Novara, Italy
- Giulia Marvaso
- Division of Radiation Oncology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy; (L.J.I.); (F.M.); (G.C.); (M.G.V.); (M.Z.); (B.A.J.-F.)
- Giulia Corrao
- Division of Radiation Oncology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy; (L.J.I.); (F.M.); (G.C.); (M.G.V.); (M.Z.); (B.A.J.-F.)
- Maria Giulia Vincini
- Division of Radiation Oncology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy; (L.J.I.); (F.M.); (G.C.); (M.G.V.); (M.Z.); (B.A.J.-F.)
- Mattia Zaffaroni
- Division of Radiation Oncology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy; (L.J.I.); (F.M.); (G.C.); (M.G.V.); (M.Z.); (B.A.J.-F.)
- Francesco Ceci
- Department of Oncology and Hemato-Oncology, University of Milan, 20141 Milan, Italy; (F.C.); (G.P.)
- Division of Nuclear Medicine, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy
- Giuseppe Petralia
- Department of Oncology and Hemato-Oncology, University of Milan, 20141 Milan, Italy; (F.C.); (G.P.)
- Precision Imaging and Research Unit, Department of Medical Imaging and Radiation Sciences, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy
- Roberto Orecchia
- Scientific Directorate, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy
- Barbara Alicja Jereczek-Fossa
- Division of Radiation Oncology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy; (L.J.I.); (F.M.); (G.C.); (M.G.V.); (M.Z.); (B.A.J.-F.)
- Department of Oncology and Hemato-Oncology, University of Milan, 20141 Milan, Italy; (F.C.); (G.P.)
16
Pham TD. Prediction of Five-Year Survival Rate for Rectal Cancer Using Markov Models of Convolutional Features of RhoB Expression on Tissue Microarray. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:3195-3204. [PMID: 37155403 DOI: 10.1109/tcbb.2023.3274211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/10/2023]
Abstract
The ability to predict survival in cancer is clinically important because the finding can help patients and physicians make optimal treatment decisions. Artificial intelligence in the context of deep learning has been increasingly realized by the informatics-oriented medical community as a powerful machine-learning technology for cancer research, diagnosis, prediction, and treatment. This paper presents the combination of deep learning, data coding, and probabilistic modeling for predicting five-year survival in a cohort of patients with rectal cancer using images of RhoB expression on biopsies. Using about one-third of the patients' data for testing, the proposed approach achieved 90% prediction accuracy, which is much higher than the direct use of the best pretrained convolutional neural network (70%) and the best coupling of a pretrained model and support vector machines (70%).
17
Moon JW, Yang E, Kim JH, Kwon OJ, Park M, Yi CA. Predicting Non-Small-Cell Lung Cancer Survival after Curative Surgery via Deep Learning of Diffusion MRI. Diagnostics (Basel) 2023; 13:2555. [PMID: 37568918 PMCID: PMC10417371 DOI: 10.3390/diagnostics13152555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2023] [Revised: 07/19/2023] [Accepted: 07/27/2023] [Indexed: 08/13/2023] Open
Abstract
BACKGROUND The objective of this study is to evaluate the predictive power of a survival model using deep learning of diffusion-weighted images (DWI) in patients with non-small-cell lung cancer (NSCLC). METHODS DWI at b-values of 0, 100, and 700 s/mm2 (DWI0, DWI100, DWI700) were preoperatively obtained for 100 NSCLC patients who underwent curative surgery (57 men, 43 women; mean age, 62 years). The ADC0-100 (perfusion-sensitive ADC), ADC100-700 (perfusion-insensitive ADC), ADC0-100-700, and demographic features were collected as input data, and 5-year survival was collected as output data. Our survival model adopted transfer learning from a pre-trained VGG-16 network, whereby the softmax layer was replaced with a binary classification layer for the prediction of 5-year survival. Three channels of input data were selected in combination from the DWIs and ADC images, and their accuracies and AUCs were compared for the best performance during 10-fold cross-validation. RESULTS 66 patients survived, and 34 patients died. The predictive performance was best for the combination DWI0-ADC0-100-ADC0-100-700 (accuracy: 92%; AUC: 0.904). This was followed by DWI0-DWI700-ADC0-100-700, DWI0-DWI100-DWI700, and DWI0-DWI0-DWI0 (accuracy: 91%, 81%, 76%; AUC: 0.889, 0.763, 0.711, respectively). Survival prediction models trained with ADC performed significantly better than the one trained with DWI only (p-values < 0.05). Survival prediction improved when demographic features were added to the model with only DWIs, but the benefit of clinical information was not prominent when added to the best-performing model using both DWI and ADC. CONCLUSIONS Deep learning may play a role in the survival prediction of lung cancer. Performance can be enhanced by inputting established, proven functional ADC parameters instead of the original DWI data alone.
Affiliation(s)
- Jung Won Moon
- Department of Radiology, Kangnam Sacred Heart Hospital, Hallym University School of Medicine, Seoul 07441, Republic of Korea
- Ehwa Yang
- Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 06351, Republic of Korea
- Jae-Hun Kim
- Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 06351, Republic of Korea
- O Jung Kwon
- Division of Respiratory and Critical Care Medicine, Department of Internal Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 06351, Republic of Korea
- Minsu Park
- Department of Information and Statistics, Chungnam National University, Daejeon 34134, Republic of Korea
- Chin A Yi
- Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 06351, Republic of Korea
18
Pfänder L, Schneider L, Büttner M, Krois J, Meyer-Lueckel H, Schwendicke F. Multi-modal deep learning for automated assembly of periapical radiographs. J Dent 2023; 135:104588. [PMID: 37348642 DOI: 10.1016/j.jdent.2023.104588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2023] [Revised: 03/23/2023] [Accepted: 06/13/2023] [Indexed: 06/24/2023] Open
Abstract
OBJECTIVES Periapical radiographs are oftentimes taken in series to display all teeth present in the oral cavity. Our aim was to automatically assemble such a series of periapical radiographs into an anatomically correct status using a multi-modal deep learning model. METHODS 4,707 periapical images from 387 patients (on average, 12 images per patient) were used. Radiographs were labeled according to their field of view, and the dataset was split into training, validation, and test sets, stratified by patient. In addition to the radiograph, the timestamp of image generation was extracted and abstracted as follows: a matrix containing the normalized timestamps of all images of a patient was constructed, representing the order in which images were taken and providing temporal context information to the deep learning model. Using the image data together with the time sequence data, a multi-modal deep learning model consisting of two residual convolutional neural networks (ResNet-152 for image data, ResNet-50 for time data) was trained. Additionally, two uni-modal models were trained on image data and time data, respectively. A custom scoring technique was used to measure model performance. RESULTS Multi-modal deep learning outperformed both uni-modal image-based learning (p<0.001) and time-based learning (p<0.05). The multi-modal deep learning model predicted tooth labels with an F1-score, sensitivity, and precision of 0.79 each, and an accuracy of 0.99. 37 out of 77 patient datasets were fully correctly assembled by multi-modal learning; in the remaining ones, usually only one image was incorrectly labeled. CONCLUSIONS Multi-modal modeling allowed automated assembly of periapical radiographs and outperformed both uni-modal models. Dental machine learning models can benefit from additional data modalities. CLINICAL SIGNIFICANCE Like humans, deep learning models may profit from multiple data sources for decision-making. We demonstrate how multi-modal learning can assist in assembling periapical radiographs into an anatomically correct status. Multi-modal learning should be considered for more complex tasks, as clinically a wealth of data is usually available and could be leveraged.
Affiliation(s)
- L Pfänder
- Department of Oral Diagnostics, Digital Health and Health Services Research, Charité-Universitätsmedizin Berlin, 14197 Berlin, Germany
- L Schneider
- Department of Oral Diagnostics, Digital Health and Health Services Research, Charité-Universitätsmedizin Berlin, 14197 Berlin, Germany; ITU/WHO Focus Group AI4Health, Topic Group Dental Diagnostics and Digital Dentistry, Geneva, Switzerland
- M Büttner
- Department of Oral Diagnostics, Digital Health and Health Services Research, Charité-Universitätsmedizin Berlin, 14197 Berlin, Germany; ITU/WHO Focus Group AI4Health, Topic Group Dental Diagnostics and Digital Dentistry, Geneva, Switzerland
- J Krois
- ITU/WHO Focus Group AI4Health, Topic Group Dental Diagnostics and Digital Dentistry, Geneva, Switzerland
- H Meyer-Lueckel
- Department of Restorative, Preventive and Pediatric Dentistry, zmk Bern, University of Bern, Switzerland
- F Schwendicke
- Department of Oral Diagnostics, Digital Health and Health Services Research, Charité-Universitätsmedizin Berlin, 14197 Berlin, Germany; ITU/WHO Focus Group AI4Health, Topic Group Dental Diagnostics and Digital Dentistry, Geneva, Switzerland
19
Chronological horse herd optimization-based gene selection with deep learning towards survival prediction using PAN-Cancer gene-expression data. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/27/2023]
20
Kim S, Lee E. A deep attention LSTM embedded aggregation network for multiple histopathological images. PLoS One 2023; 18:e0287301. [PMID: 37384648 PMCID: PMC10310006 DOI: 10.1371/journal.pone.0287301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2022] [Accepted: 06/03/2023] [Indexed: 07/01/2023] Open
Abstract
Recent advancements in computer vision and neural networks have facilitated the medical imaging survival analysis for various medical applications. However, challenges arise when patients have multiple images from multiple lesions, as current deep learning methods provide multiple survival predictions for each patient, complicating result interpretation. To address this issue, we developed a deep learning survival model that can provide accurate predictions at the patient level. We propose a deep attention long short-term memory embedded aggregation network (DALAN) for histopathology images, designed to simultaneously perform feature extraction and aggregation of lesion images. This design enables the model to efficiently learn imaging features from lesions and aggregate lesion-level information to the patient level. DALAN comprises a weight-shared CNN, attention layers, and LSTM layers. The attention layer calculates the significance of each lesion image, while the LSTM layer combines the weighted information to produce an all-encompassing representation of the patient's lesion data. Our proposed method performed better on both simulated and real data than other competing methods in terms of prediction accuracy. We evaluated DALAN against several naive aggregation methods on simulated and real datasets. Our results showed that DALAN outperformed the competing methods in terms of c-index on the MNIST and Cancer dataset simulations. On the real TCGA dataset, DALAN also achieved a higher c-index of 0.803±0.006 compared to the naive methods and the competing models. Our DALAN effectively aggregates multiple histopathology images, demonstrating a comprehensive survival model using attention and LSTM mechanisms.
Affiliation(s)
- Sunghun Kim
- Department of Information and Statistics, Chungnam National University, Daejeon, Republic of Korea
- Department of Artificial Intelligence, Sungkyunkwan University, Suwon, Republic of Korea
- Eunjee Lee
- Department of Information and Statistics, Chungnam National University, Daejeon, Republic of Korea
21
Hao Y, Jing XY, Sun Q. Cancer survival prediction by learning comprehensive deep feature representation for multiple types of genetic data. BMC Bioinformatics 2023; 24:267. [PMID: 37380946 DOI: 10.1186/s12859-023-05392-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2023] [Accepted: 06/19/2023] [Indexed: 06/30/2023] Open
Abstract
BACKGROUND Cancer is one of the leading causes of death around the world. Accurate prediction of survival time is important, as it can help clinicians devise appropriate therapeutic schemes. Cancer data can be characterized by varied molecular features, clinical behaviors, and morphological appearances. However, cancer heterogeneity usually makes patient samples with different risks (i.e., short and long survival times) inseparable, leading to unsatisfactory prediction results. Clinical studies have shown that genetic data tend to contain more molecular biomarkers associated with cancer, and hence integrating multiple types of genetic data may be a feasible way to deal with cancer heterogeneity. Although multiple types of gene data have been used in existing work, how to learn more effective features for cancer survival prediction has not been well studied. RESULTS To this end, we propose a deep learning approach to reduce the negative impact of cancer heterogeneity and improve the cancer survival prediction effect. It represents each type of genetic data as shared and specific features, which can capture the consensus and complementary information among all types of data. We collected mRNA expression, DNA methylation, and microRNA expression data for four cancers to conduct experiments. CONCLUSIONS Experimental results demonstrate that our approach substantially outperforms established integrative methods and is effective for cancer survival prediction. AVAILABILITY AND IMPLEMENTATION https://github.com/githyr/ComprehensiveSurvival
Affiliation(s)
- Yaru Hao
- School of Computer Science, Wuhan University, Wuhan, China
- Xiao-Yuan Jing
- School of Computer Science, Wuhan University, Wuhan, China
- School of Computer, Guangdong University of Petrochemical Technology, Maoming, China
- State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
- Qixing Sun
- School of Computer Science, Wuhan University, Wuhan, China
22
Padegal G, Rao MK, Boggaram Ravishankar OA, Acharya S, Athri P, Srinivasa G. Analysis of RNA-Seq data using self-supervised learning for vital status prediction of colorectal cancer patients. BMC Bioinformatics 2023; 24:241. [PMID: 37286944 DOI: 10.1186/s12859-023-05347-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/14/2023] [Accepted: 05/21/2023] [Indexed: 06/09/2023] Open
Abstract
BACKGROUND RNA sequencing (RNA-Seq) is a technique that utilises the capabilities of next-generation sequencing to study a cellular transcriptome, i.e., to determine the amount of RNA present at a given time in a given biological sample. The advancement of RNA-Seq technology has resulted in a large volume of gene expression data for analysis. RESULTS Our computational model (built on top of TabNet) is first pretrained on an unlabelled dataset of multiple types of adenomas and adenocarcinomas and later fine-tuned on the labelled dataset, showing promising results for estimating the vital status of colorectal cancer patients. We achieve a final cross-validated ROC-AUC score of 0.88 by using multiple modalities of data. CONCLUSION The results of this study demonstrate that self-supervised learning methods pretrained on a vast corpus of unlabelled data outperform traditional supervised learning methods such as XGBoost, neural networks, and decision trees that have been prevalent in the tabular domain. The results are further boosted by the inclusion of multiple modalities of data pertaining to the patients in question. We find that genes such as RBM3, GSPT1, and MAD2L1, identified as important to the computational model's prediction task through model interpretability, are corroborated by pathological evidence in the current literature.
Affiliation(s)
- Girivinay Padegal
- PES Center for Pattern Recognition, Department of Computer Science and Engineering, PES University, Bengaluru, 560085, India
- Murali Krishna Rao
- PES Center for Pattern Recognition, Department of Computer Science and Engineering, PES University, Bengaluru, 560085, India
- Om Amitesh Boggaram Ravishankar
- PES Center for Pattern Recognition, Department of Computer Science and Engineering, PES University, Bengaluru, 560085, India
- Sathwik Acharya
- PES Center for Pattern Recognition, Department of Computer Science and Engineering, PES University, Bengaluru, 560085, India
- Prashanth Athri
- Department of Computer Science and Engineering, PES University Electronic City Campus, Bengaluru, 560100, India
- Gowri Srinivasa
- PES Center for Pattern Recognition, Department of Computer Science and Engineering, PES University, Bengaluru, 560085, India
23
Altuhaifa F. Time Series Prediction of Lung Cancer Death Rates on the Basis of SEER Data. JCO Clin Cancer Inform 2023; 7:e2300011. [PMID: 37311162 DOI: 10.1200/cci.23.00011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2023] [Revised: 03/30/2023] [Accepted: 04/19/2023] [Indexed: 06/15/2023] Open
Abstract
PURPOSE The purpose of this study was to apply different time series analytical techniques to SEER US lung cancer death rate data to develop a best-fit model. METHODS Three models for yearly time series predictions were built: autoregressive integrated moving average (ARIMA), simple exponential smoothing (SES), and Holt's double exponential smoothing (HDES). The three models were built using Python 3.9 on Anaconda 2022.10. RESULTS This study used SEER data from 1975 to 2018 and included 545,486 patients with lung cancer. The best parameters for ARIMA were (p, d, q) = (0, 2, 2). The best parameter for SES was α = .995, whereas the best parameters for HDES were α = .4 and β = .9. HDES was the model that best fit the lung cancer death rate data, with a root mean square error (RMSE) of 132.91. CONCLUSION Including monthly diagnoses, death rates, and years in SEER data increases the number of observations for training and test sets, enhancing the performance of time series models. The reliability of the RMSE was assessed relative to the mean lung cancer mortality rate. Owing to the high mean lung cancer death rate of 8,405 patients per year, it is acceptable for reliable models to have large RMSEs.
Affiliation(s)
- Fatimah Altuhaifa
- School of Computing and Information Technology, University of Wollongong, Wollongong, New South Wales, Australia
24
Olatunji I, Cui F. Multimodal AI for prediction of distant metastasis in carcinoma patients. FRONTIERS IN BIOINFORMATICS 2023; 3:1131021. [PMID: 37228671 PMCID: PMC10203594 DOI: 10.3389/fbinf.2023.1131021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/24/2022] [Accepted: 04/24/2023] [Indexed: 05/27/2023] Open
Abstract
Metastasis of cancer is directly related to death in almost all cases; however, much about this process remains to be understood. Despite advancements in the available radiological investigation techniques, not all cases of Distant Metastasis (DM) are diagnosed at initial clinical presentation. Also, there are currently no standard biomarkers of metastasis. Early, accurate diagnosis of DM is, however, crucial for clinical decision making and planning of appropriate management strategies. Previous works have achieved little success in attempts to predict DM from either clinical, genomic, radiology, or histopathology data. In this work, we attempt a multimodal approach to predict the presence of DM in cancer patients by combining gene expression data, clinical data, and histopathology images. We tested a novel combination of the Random Forest (RF) algorithm with an optimization technique for gene selection, and investigated whether gene expression patterns in the primary tissues of three cancer types (Bladder Carcinoma, Pancreatic Adenocarcinoma, and Head and Neck Squamous Carcinoma) with DM are similar or different. Gene expression biomarkers of DM identified by our proposed method outperformed Differentially Expressed Genes (DEGs) identified by the DESeq2 software package in the task of predicting the presence or absence of DM. Genes involved in DM tend to be cancer-type specific rather than general across all cancers. Our results also indicate that multimodal data is more predictive of metastasis than any of the three unimodal data types tested, and genomic data provides the highest contribution by a wide margin. The results re-emphasize the importance of sufficient image data availability when a weakly supervised training technique is used. Code is made available at: https://github.com/rit-cui-lab/Multimodal-AI-for-Prediction-of-Distant-Metastasis-in-Carcinoma-Patients.
25
Fan L, Sowmya A, Meijering E, Song Y. Cancer Survival Prediction From Whole Slide Images With Self-Supervised Learning and Slide Consistency. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:1401-1412. [PMID: 37015696 DOI: 10.1109/tmi.2022.3228275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Histopathological Whole Slide Images (WSIs) at giga-pixel resolution are the gold standard for cancer analysis and prognosis. Due to the scarcity of pixel- or patch-level annotations of WSIs, many existing methods attempt to predict survival outcomes based on a three-stage strategy that includes patch selection, patch-level feature extraction and aggregation. However, the patch features are usually extracted by using truncated models (e.g. ResNet) pretrained on ImageNet without fine-tuning on WSI tasks, and the aggregation stage does not consider the many-to-one relationship between multiple WSIs and the patient. In this paper, we propose a novel survival prediction framework that consists of patch sampling, feature extraction and patient-level survival prediction. Specifically, we employ two kinds of self-supervised learning methods, i.e. colorization and cross-channel, as pretext tasks to train convnet-based models that are tailored for extracting features from WSIs. Then, at the patient-level survival prediction we explicitly aggregate features from multiple WSIs, using consistency and contrastive losses to normalize slide-level features at the patient level. We conduct extensive experiments on three large-scale datasets: TCGA-GBM, TCGA-LUSC and NLST. Experimental results demonstrate the effectiveness of our proposed framework, as it achieves state-of-the-art performance in comparison with previous studies, with concordance index of 0.670, 0.679 and 0.711 on TCGA-GBM, TCGA-LUSC and NLST, respectively.
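The concordance indices reported above (0.670, 0.679, 0.711) are Harrell's C: over all comparable patient pairs, the fraction in which the patient who dies sooner was assigned the higher risk score. A minimal sketch of the metric (illustrative only, not the authors' evaluation code; the toy times and scores in the test are invented):

```python
def concordance_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.
    events[i] is 1 if the death was observed, 0 if censored; a pair
    (i, j) is comparable only when i's event is observed before j's
    follow-up time."""
    concordant, ties, comparable = 0, 0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    ties += 1
    # ties between risk scores count as half-correct, by convention
    return (concordant + 0.5 * ties) / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect risk ordering, which is why the 0.67-0.71 range reported here represents a meaningful but far-from-saturated signal.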
26
Wissel D, Rowson D, Boeva V. Systematic comparison of multi-omics survival models reveals a widespread lack of noise resistance. CELL REPORTS METHODS 2023; 3:100461. [PMID: 37159669 PMCID: PMC10162996 DOI: 10.1016/j.crmeth.2023.100461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/14/2022] [Revised: 02/01/2023] [Accepted: 03/30/2023] [Indexed: 05/11/2023]
Abstract
As observed in several previous studies, integrating more molecular modalities in multi-omics cancer survival models may not always improve model accuracy. In this study, we compared eight deep learning and four statistical integration techniques for survival prediction on 17 multi-omics datasets, examining model performance in terms of overall accuracy and noise resistance. We found that one deep learning method, mean late fusion, and two statistical methods, PriorityLasso and BlockForest, performed best in terms of both noise resistance and overall discriminative and calibration performance. Nevertheless, all methods struggled to adequately handle noise when too many modalities were added. In summary, we confirmed that current multi-omics survival methods are not sufficiently noise resistant. We recommend relying on only modalities for which there is known predictive value for a particular cancer type until models that have stronger noise-resistance properties are developed.
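Mean late fusion, the best-performing deep method in this comparison, amounts to averaging the per-modality risk predictions after each unimodal model has been trained. A minimal sketch (illustrative only; the modality names and scores below are invented, not from the benchmark):

```python
def mean_late_fusion(modality_scores):
    """Mean late fusion: average the risk scores produced by
    independently trained unimodal models.
    modality_scores: dict mapping modality name -> per-patient scores."""
    score_lists = list(modality_scores.values())
    n_patients = len(score_lists[0])
    return [sum(scores[i] for scores in score_lists) / len(score_lists)
            for i in range(n_patients)]
```

The appeal of this design, consistent with the noise-resistance finding above, is that a pure-noise modality can only dilute the fused score rather than corrupt a jointly learned representation.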
Affiliation(s)
- David Wissel
- ETH Zurich, Department of Computer Science, Zurich, Switzerland
- University of Zurich, Department of Molecular Life Sciences, Zurich, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Daniel Rowson
- ETH Zurich, Department of Computer Science, Zurich, Switzerland
- Valentina Boeva
- ETH Zurich, Department of Computer Science, Zurich, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Université de Paris UMR-S1016, Institut Cochin, Inserm U1016, Paris, France
- Corresponding author
27
Susič D, Syed-Abdul S, Dovgan E, Jonnagaddala J, Gradišek A. Artificial intelligence based personalized predictive survival among colorectal cancer patients. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 231:107435. [PMID: 36842345 DOI: 10.1016/j.cmpb.2023.107435] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/24/2022] [Revised: 12/14/2022] [Accepted: 02/18/2023] [Indexed: 06/18/2023]
Abstract
BACKGROUND AND OBJECTIVE Colorectal cancer is a major health concern. It is now the third most common cancer and the fourth leading cause of cancer mortality worldwide. The aim of this study was to evaluate the performance of machine learning algorithms for predicting survival of colorectal cancer patients 1 to 5 years after diagnosis, and to identify the most important variables. METHODS A sample of 1236 patients diagnosed with colorectal cancer and 118 predictor variables was used. The outcome of interest was a binary variable indicating whether or not the patient survived the number of years in question. 20 predictor variables were selected using the mutual information score with the outcome. We implemented 11 machine learning algorithms and evaluated their performance with a 5 by 2-fold cross-validation with stratified folds and with paired Student's t-tests. We compared the results with the Kaplan-Meier estimator and Cox's proportional hazards regression. RESULTS Using the 20 most important predictor variables for each of the survival years, the logistic regression algorithm achieved an area under the receiver operating characteristic curve of 0.850 (0.014 SD, 0.840-0.860 95% CI) for the 1-year, and 0.872 (0.014 SD, 0.861-0.882 95% CI) for the 5-year survival prediction. Using only the 5 most important predictor variables, the corresponding values are 0.793 (0.020 SD, 0.778-0.807 95% CI) and 0.794 (0.011 SD, 0.785-0.802 95% CI). The most important variables for 1-year prediction were R residual, M distant metastasis, overall stage, probable recurrence within 5 years, and tumour length, whereas for 5-year prediction the most important were probable recurrence within 5 years, R residual, M distant metastasis, number of positive lymph nodes, and palliative chemotherapy. Biomarkers do not appear among the top 20 most important variables. For all survival intervals, the probability of the top model agrees with the Kaplan-Meier estimate, both within one standard deviation and within the 95% confidence interval. CONCLUSIONS The findings suggest that machine learning algorithms can predict the survival probability of colorectal cancer patients and can be used to inform patients and assist decision-making in clinical care management. In addition, this study identifies the most essential variables for estimating short- and long-term survival among patients with colorectal cancer.
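The feature-selection step described above ranks predictors by their mutual information with the binary survival outcome. For discrete variables this reduces to a plug-in estimate over the joint distribution; a minimal sketch (illustrative only, not the study's pipeline, and the toy feature/outcome vectors in the test are invented):

```python
from math import log2
from collections import Counter

def mutual_information(feature, outcome):
    """Plug-in estimate of I(X; Y) in bits between two discrete
    variables of equal length, from their empirical joint counts."""
    n = len(feature)
    count_x = Counter(feature)
    count_y = Counter(outcome)
    count_xy = Counter(zip(feature, outcome))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi
```

Ranking all 118 predictors by this score and keeping the top 20 (or top 5) is exactly the kind of filter selection the abstract describes; libraries such as scikit-learn provide equivalent estimators for continuous features.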
Affiliation(s)
- David Susič
- Jožef Stefan Institute, Jamova cesta 39, SI-1000 Ljubljana, Slovenia; Jožef Stefan International Postgraduate School, Jamova cesta 39, SI-1000 Ljubljana, Slovenia
- Shabbir Syed-Abdul
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan.
- Erik Dovgan
- Jožef Stefan Institute, Jamova cesta 39, SI-1000 Ljubljana, Slovenia
- Anton Gradišek
- Jožef Stefan Institute, Jamova cesta 39, SI-1000 Ljubljana, Slovenia.
28
Milewski D, Jung H, Brown GT, Liu Y, Somerville B, Lisle C, Ladanyi M, Rudzinski ER, Choo-Wosoba H, Barkauskas DA, Lo T, Hall D, Linardic CM, Wei JS, Chou HC, Skapek SX, Venkatramani R, Bode PK, Steinberg SM, Zaki G, Kuznetsov IB, Hawkins DS, Shern JF, Collins J, Khan J. Predicting Molecular Subtype and Survival of Rhabdomyosarcoma Patients Using Deep Learning of H&E Images: A Report from the Children's Oncology Group. Clin Cancer Res 2023; 29:364-378. [PMID: 36346688 PMCID: PMC9843436 DOI: 10.1158/1078-0432.ccr-22-1663] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2022] [Revised: 08/01/2022] [Accepted: 11/02/2022] [Indexed: 11/09/2022]
Abstract
PURPOSE Rhabdomyosarcoma (RMS) is an aggressive soft-tissue sarcoma, which primarily occurs in children and young adults. We previously reported specific genomic alterations in RMS, which strongly correlated with survival; however, predicting these mutations or high-risk disease at diagnosis remains a significant challenge. In this study, we utilized convolutional neural networks (CNN) to learn histologic features associated with driver mutations and outcome using hematoxylin and eosin (H&E) images of RMS. EXPERIMENTAL DESIGN Digital whole slide H&E images were collected from clinically annotated diagnostic tumor samples from 321 patients with RMS enrolled in Children's Oncology Group (COG) trials (1998-2017). Patches were extracted and fed into deep learning CNNs to learn features associated with mutations and relative event-free survival risk. The performance of the trained models was evaluated against independent test sample data (n = 136) or holdout test data. RESULTS The trained CNN could accurately classify alveolar RMS, a high-risk subtype associated with PAX3/7-FOXO1 fusion genes, with an ROC of 0.85 on an independent test dataset. CNN models trained on mutationally-annotated samples identified tumors with RAS pathway with a ROC of 0.67, and high-risk mutations in MYOD1 or TP53 with a ROC of 0.97 and 0.63, respectively. Remarkably, CNN models were superior in predicting event-free and overall survival compared with current molecular-clinical risk stratification. CONCLUSIONS This study demonstrates that high-risk features, including those associated with certain mutations, can be readily identified at diagnosis using deep learning. CNNs are a powerful tool for diagnostic and prognostic prediction of rhabdomyosarcoma, which will be tested in prospective COG clinical trials.
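The patch-extraction step this study relies on, tiling a gigapixel whole-slide image into fixed-size patches before CNN training, can be sketched with plain array slicing. This is an illustrative sketch only (not the COG pipeline; real WSI tiling works on pyramidal image files and filters out background tissue):

```python
def extract_patches(image, patch_size, stride):
    """Tile a 2D image (a list of pixel rows) into square patches of
    side patch_size, sliding by stride, as done before feeding WSI
    regions to a CNN. Partial patches at the border are discarded."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patches.append([row[left:left + patch_size]
                            for row in image[top:top + patch_size]])
    return patches
```

With stride equal to patch_size the tiles are non-overlapping; a smaller stride yields overlapping patches, a common augmentation when slide-level labels are weak.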
Affiliation(s)
- Hyun Jung
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research, Frederick, Maryland
- G. Thomas Brown
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research, Frederick, Maryland
- Artificial Intelligence Resource, NCI, NIH, Bethesda, Maryland
- Yanling Liu
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research, Frederick, Maryland
- Curtis Lisle
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research, Frederick, Maryland
- KnowledgeVis, LLC, Altamonte Springs, Florida
- Marc Ladanyi
- Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, New York
- Erin R. Rudzinski
- Department of Laboratories, Seattle Children's Hospital, Seattle, Washington
- Hyoyoung Choo-Wosoba
- Biostatistics and Data Management Section, Keck School of Medicine of the University of Southern California, Los Angeles, California
- Donald A. Barkauskas
- Department of Population and Public Health Sciences, Keck School of Medicine of the University of Southern California, Los Angeles, California
- Children's Oncology Group, Monrovia, California
- Tammy Lo
- Children's Oncology Group, Monrovia, California
- David Hall
- Children's Oncology Group, Monrovia, California
- Corinne M. Linardic
- Departments of Pediatrics and Pharmacology & Cancer Biology, Duke University School of Medicine, Durham, North Carolina
- Jun S. Wei
- Genetics Branch, NCI, NIH, Bethesda, Maryland
- Stephen X. Skapek
- Department of Pediatrics, Division of Hematology/Oncology, University of Texas Southwestern Medical Center, Dallas, Texas
- Rajkumar Venkatramani
- Division of Hematology/Oncology, Texas Children's Cancer Center, Baylor College of Medicine, Houston, Texas
- Peter K. Bode
- Institut für Pathologie, Kantonsspital Winterthur, Winterthur, Switzerland
- Seth M. Steinberg
- Biostatistics and Data Management Section, Keck School of Medicine of the University of Southern California, Los Angeles, California
- George Zaki
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, Maryland
- Igor B. Kuznetsov
- Department of Epidemiology & Biostatistics, School of Public Health, University at Albany, Rensselaer, New York
- Douglas S. Hawkins
- Chair of Children's Oncology Group, Department of Pediatrics, Seattle Children's Hospital, Fred Hutchinson Cancer Research Center, University of Washington, Seattle, Washington
- Jack F. Shern
- Pediatric Oncology Branch, Center for Cancer Research, NIH, Bethesda, Maryland
- Jack Collins
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research, Frederick, Maryland
- Javed Khan
- Genetics Branch, NCI, NIH, Bethesda, Maryland
29
Alleman K, Knecht E, Huang J, Zhang L, Lam S, DeCuypere M. Multimodal Deep Learning-Based Prognostication in Glioma Patients: A Systematic Review. Cancers (Basel) 2023; 15:cancers15020545. [PMID: 36672494 PMCID: PMC9856816 DOI: 10.3390/cancers15020545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2022] [Revised: 01/05/2023] [Accepted: 01/08/2023] [Indexed: 01/18/2023] Open
Abstract
Malignant brain tumors pose a substantial burden on morbidity and mortality. As clinical data collection improves, along with the capacity to analyze it, novel predictive clinical tools may improve prognosis prediction. Deep learning (DL) holds promise for integrating clinical data of various modalities. A systematic review of the DL-based prognostication of gliomas was performed using the Embase (Elsevier), PubMed MEDLINE (National library of Medicine), and Scopus (Elsevier) databases, in accordance with PRISMA guidelines. All included studies focused on the prognostication of gliomas, and predicted overall survival (13 studies, 81%), overall survival as well as genotype (2 studies, 12.5%), and response to immunotherapy (1 study, 6.2%). Multimodal analyses were varied, with 6 studies (37.5%) combining MRI with clinical data; 6 studies (37.5%) integrating MRI with histologic, clinical, and biomarker data; 3 studies (18.8%) combining MRI with genomic data; and 1 study (6.2%) combining histologic imaging with clinical data. Studies that compared multimodal models to unimodal-only models demonstrated improved predictive performance. The risk of bias was mixed, most commonly due to inconsistent methodological reporting. Overall, the use of multimodal data in DL assessments of gliomas leads to a more accurate overall survival prediction. However, due to data limitations and a lack of transparency in model and code reporting, the full extent of multimodal DL as a resource for brain tumor patients has not yet been realized.
Affiliation(s)
- Kaitlyn Alleman
- Chicago Medical School, Rosalind Franklin University of Science and Medicine, Chicago, IL 60064, USA
- Erik Knecht
- Chicago Medical School, Rosalind Franklin University of Science and Medicine, Chicago, IL 60064, USA
- Jonathan Huang
- Division of Pediatric Neurosurgery, Ann and Robert H. Lurie Children’s Hospital of Chicago, Chicago, IL 60611, USA
- Lu Zhang
- Division of Pediatric Neurosurgery, Ann and Robert H. Lurie Children’s Hospital of Chicago, Chicago, IL 60611, USA
- Sandi Lam
- Division of Pediatric Neurosurgery, Ann and Robert H. Lurie Children’s Hospital of Chicago, Chicago, IL 60611, USA
- Department of Neurological Surgery, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
- Malnati Brain Tumor Institute of the Lurie Comprehensive Cancer Center, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
- Michael DeCuypere
- Division of Pediatric Neurosurgery, Ann and Robert H. Lurie Children’s Hospital of Chicago, Chicago, IL 60611, USA
- Department of Neurological Surgery, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
- Malnati Brain Tumor Institute of the Lurie Comprehensive Cancer Center, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
- Corresponding author
30
Han Y, Pan F, Song H, Luo R, Li C, Pi H, Wang J, Li T. Intelligent injury prediction for traumatic airway obstruction. Med Biol Eng Comput 2023; 61:139-153. [PMID: 36331757 DOI: 10.1007/s11517-022-02706-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/23/2022] [Accepted: 10/22/2022] [Indexed: 11/07/2022]
Abstract
Airway obstruction is one of the crucial causes of death in trauma patients during first aid. It is extremely challenging to accurately treat large numbers of casualties with airway obstruction in hospitals, and the diagnosis of airway obstruction in an emergency mostly relies on the medical experience of physicians. In this paper, we propose the feature selection approach genetic algorithm-mean decrease impurity (GA-MDI) to effectively minimize the number of features while ensuring the accuracy of prediction. Furthermore, we design a multi-modal neural network, called fully convolutional network with squeeze-and-excitation and multilayer perceptron (FCN-SE + MLP), to help physicians predict the severity of airway obstruction. We validate the effectiveness of the proposed feature selection approach and multi-modal model on the emergency medical database from the Chinese PLA General Hospital. The experimental results show that GA-MDI outperforms existing feature selection algorithms, and that FCN-SE + MLP can effectively and accurately predict the severity of airway obstruction, which can assist clinicians in making treatment decisions for casualties with airway obstruction.
Affiliation(s)
- Youfang Han
- School of Software, Tsinghua University, Beijing, China
- Fei Pan
- Emergency Department, The First Medical Center of PLA General Hospital, Beijing, China
- Hainan Song
- Emergency Department, The First Medical Center of PLA General Hospital, Beijing, China
- Ruihong Luo
- School of Software, Tsinghua University, Beijing, China
- Chunping Li
- School of Software, Tsinghua University, Beijing, China.
- Hongying Pi
- Nursing Department, PLA General Hospital, Beijing, China.
- Jianrong Wang
- Nursing Department, PLA General Hospital, Beijing, China.
- Tanshi Li
- Emergency Department, The First Medical Center of PLA General Hospital, Beijing, China
31
Interpretable prognostic modeling of endometrial cancer. Sci Rep 2022; 12:21543. [PMID: 36513790 PMCID: PMC9747711 DOI: 10.1038/s41598-022-26134-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2022] [Accepted: 12/09/2022] [Indexed: 12/15/2022] Open
Abstract
Endometrial carcinoma (EC) is one of the most common gynecological cancers in the world. In this work we apply Cox proportional hazards (CPH) and optimal survival tree (OST) algorithms to the retrospective prognostic modeling of disease-specific survival in 842 EC patients. We demonstrate that linear CPH models are preferred for the EC risk assessment based on clinical features alone, while interpretable, non-linear OST models are favored when patient profiles can be supplemented with additional biomarker data. We show how visually interpretable tree models can help generate and explore novel research hypotheses by studying the OST decision path structure, in which L1 cell adhesion molecule expression and estrogen receptor status are correctly indicated as important risk factors in the p53 abnormal EC subgroup. To aid further clinical adoption of advanced machine learning techniques, we stress the importance of quantifying model discrimination and calibration performance in the development of explainable clinical prediction models.
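Both models compared here are benchmarked against nonparametric survival estimates; the Kaplan-Meier product-limit estimator that underlies such comparisons is compact enough to sketch. This is an illustrative sketch only (not the study's code; the toy times and event flags in the test are invented):

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate of the survival function.
    events[i] is 1 for an observed death, 0 for censoring.
    Returns (distinct event times, survival probability after each)."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    survival = 1.0
    event_times, survival_probs = [], []
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths = leaving = 0
        # group all subjects sharing this time (deaths and censorings)
        while i < len(order) and times[order[i]] == t:
            deaths += events[order[i]]
            leaving += 1
            i += 1
        if deaths:
            survival *= (at_risk - deaths) / at_risk
            event_times.append(t)
            survival_probs.append(survival)
        at_risk -= leaving
    return event_times, survival_probs
```

Censored subjects (events = 0) shrink the risk set without stepping the curve down, which is the property that distinguishes this estimator from a naive survival fraction.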
32
A Multimodal Ensemble Driven by Multiobjective Optimisation to Predict Overall Survival in Non-Small-Cell Lung Cancer. J Imaging 2022; 8:jimaging8110298. [DOI: 10.3390/jimaging8110298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2022] [Revised: 10/28/2022] [Accepted: 10/30/2022] [Indexed: 11/06/2022] Open
Abstract
Lung cancer accounts for more deaths worldwide than any other cancer. In order to provide patients with the most effective treatment for these aggressive tumours, multimodal learning is emerging as a new and promising field of research that aims to extract complementary information from the data of different modalities for prognostic and predictive purposes. This knowledge could be used to optimise current treatments and maximise their effectiveness. To predict overall survival, in this work, we investigate the use of multimodal learning on the CLARO dataset, which includes CT images and clinical data collected from a cohort of non-small-cell lung cancer patients. Our method allows the identification of the optimal set of classifiers to be included in the ensemble in a late fusion approach. Specifically, after training unimodal models on each modality, it selects the best ensemble by solving a multiobjective optimisation problem that maximises both the recognition performance and the diversity of the predictions. In the ensemble, the labels of each sample are assigned using the majority voting rule. As further validation, we show that the proposed ensemble outperforms the models learning a single modality, obtaining state-of-the-art results on the task at hand.
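The majority-voting rule used to fuse the selected classifiers has a one-function sketch. This is an illustration of the rule only, not the paper's ensemble-selection machinery (the label lists in the test are invented):

```python
from collections import Counter

def majority_vote(predictions):
    """Fuse per-classifier label lists by majority voting.
    predictions: list of label lists, one per ensemble member,
    all of the same length. Ties resolve to the label seen first."""
    n_samples = len(predictions[0])
    fused = []
    for i in range(n_samples):
        votes = Counter(member[i] for member in predictions)
        fused.append(votes.most_common(1)[0][0])
    return fused
```

The multiobjective selection step matters precisely because this rule only helps when the chosen members make diverse, not systematically correlated, errors.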
33
Lipkova J, Chen RJ, Chen B, Lu MY, Barbieri M, Shao D, Vaidya AJ, Chen C, Zhuang L, Williamson DFK, Shaban M, Chen TY, Mahmood F. Artificial intelligence for multimodal data integration in oncology. Cancer Cell 2022; 40:1095-1110. [PMID: 36220072 PMCID: PMC10655164 DOI: 10.1016/j.ccell.2022.09.012] [Citation(s) in RCA: 85] [Impact Index Per Article: 42.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/09/2022] [Revised: 07/12/2022] [Accepted: 09/15/2022] [Indexed: 02/07/2023]
Abstract
In oncology, the patient state is characterized by a whole spectrum of modalities, ranging from radiology, histology, and genomics to electronic health records. Current artificial intelligence (AI) models operate mainly in the realm of a single modality, neglecting the broader clinical context, which inevitably diminishes their potential. Integration of different data modalities provides opportunities to increase robustness and accuracy of diagnostic and prognostic models, bringing AI closer to clinical practice. AI models are also capable of discovering novel patterns within and across modalities suitable for explaining differences in patient outcomes or treatment resistance. The insights gleaned from such models can guide exploration studies and contribute to the discovery of novel biomarkers and therapeutic targets. To support these advances, here we present a synopsis of AI methods and strategies for multimodal data fusion and association discovery. We outline approaches for AI interpretability and directions for AI-driven exploration through multimodal data interconnections. We examine challenges in clinical adoption and discuss emerging solutions.
Affiliation(s)
- Jana Lipkova
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Richard J Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Bowen Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Department of Computer Science, Harvard University, Cambridge, MA, USA
- Ming Y Lu
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- Matteo Barbieri
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Daniel Shao
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Harvard-MIT Health Sciences and Technology (HST), Cambridge, MA, USA
- Anurag J Vaidya
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Harvard-MIT Health Sciences and Technology (HST), Cambridge, MA, USA
- Chengkuan Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Luoting Zhuang
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Drew F K Williamson
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Muhammad Shaban
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Tiffany Y Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Faisal Mahmood
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA; Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA.
34
Choi SR, Lee M. Estimating the Prognosis of Low-Grade Glioma with Gene Attention Using Multi-Omics and Multi-Modal Schemes. BIOLOGY 2022; 11:biology11101462. [PMID: 36290366 PMCID: PMC9598836 DOI: 10.3390/biology11101462] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/31/2022] [Revised: 10/01/2022] [Accepted: 10/02/2022] [Indexed: 11/20/2022]
Abstract
The prognosis estimation of low-grade glioma (LGG) patients with deep learning models using gene expression data has been extensively studied in recent years. However, the deep learning models used in these studies do not utilize the latest deep learning techniques, such as residual learning and ensemble learning. To address this limitation, in this study, a deep learning model using multi-omics and multi-modal schemes, namely the Multi-Prognosis Estimation Network (Multi-PEN), is proposed. When using Multi-PEN, gene attention layers are employed for each datatype, including mRNA and miRNA, thereby allowing us to identify prognostic genes. Additionally, recent developments in deep learning, such as residual learning and layer normalization, are utilized. As a result, Multi-PEN demonstrates competitive performance compared to conventional models for prognosis estimation. Furthermore, the most significant prognostic mRNA and miRNA were identified using the attention layers in Multi-PEN. For instance, MYBL1 was identified as the most significant prognostic mRNA. Such a result accords with the findings in existing studies that have demonstrated that MYBL1 regulates cell survival, proliferation, and differentiation. Additionally, hsa-mir-421 was identified as the most significant prognostic miRNA, and it has been extensively reported that hsa-mir-421 is highly associated with various cancers. These results indicate that the estimations of Multi-PEN are valid and reliable and showcase Multi-PEN's capacity to present hypotheses regarding prognostic mRNAs and miRNAs.
35
Hou J, Jia X, Xie Y, Qin W. Integrative Histology-Genomic Analysis Predicts Hepatocellular Carcinoma Prognosis Using Deep Learning. Genes (Basel) 2022; 13:genes13101770. [PMID: 36292654 PMCID: PMC9601633 DOI: 10.3390/genes13101770] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2022] [Revised: 09/25/2022] [Accepted: 09/28/2022] [Indexed: 11/04/2022] Open
Abstract
Cancer prognosis analysis is of essential interest in clinical practice. To explore the prognostic power of computational histopathology and genomics, this paper constructs a multi-modality prognostic model for survival prediction. We collected 346 patients diagnosed with hepatocellular carcinoma (HCC) from The Cancer Genome Atlas (TCGA); each patient has 1-3 whole-slide images (WSIs) and an mRNA expression file. WSIs were processed by a multi-instance deep learning model to obtain patient-level survival risk scores; mRNA expression data were processed by weighted gene co-expression network analysis (WGCNA), and the top hub genes of each module were extracted as risk factors. Information from the two modalities was integrated by a Cox proportional hazards model to predict patient outcomes. The overall survival predictions of the multi-modality model (concordance index (C-index): 0.746, 95% confidence interval (CI): ±0.077) outperformed those based on the histopathology risk score or the hub genes alone. Furthermore, in the prediction of 1-year and 3-year survival, the area under the curve reached 0.816 and 0.810, respectively. In conclusion, this paper provides an effective workflow for multi-modality prognosis of HCC; the integration of histopathological and genomic information has the potential to assist clinical prognosis management.
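The concordance index (C-index) used above to compare models can be computed from comparable patient pairs. A minimal pure-Python sketch of Harrell's C-index on toy data (not the authors' implementation; all names and values are illustrative):

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: the fraction of comparable patient pairs whose
    predicted risk ordering agrees with the observed survival ordering.
    times, events, risk_scores are parallel lists; event = 1 means death
    observed, 0 means censored. A pair is comparable only if the patient
    with the shorter time had an observed event."""
    concordant, tied, comparable = 0, 0, 0
    n = len(times)
    for i in range(n):
        for j in range(i + 1, n):
            # order the pair so that patient a has the earlier time
            a, b = (i, j) if times[i] < times[j] else (j, i)
            if times[a] == times[b] or events[a] == 0:
                continue  # not comparable: tied times or earlier time censored
            comparable += 1
            if risk_scores[a] > risk_scores[b]:
                concordant += 1  # higher predicted risk died earlier: agree
            elif risk_scores[a] == risk_scores[b]:
                tied += 1
    return (concordant + 0.5 * tied) / comparable

times  = [5, 3, 9, 7, 2]             # toy follow-up times
events = [1, 1, 0, 1, 1]             # patient 2 is censored
risk   = [0.7, 0.9, 0.1, 0.05, 0.95] # toy model risk scores
print(round(concordance_index(times, events, risk), 3))  # → 0.9
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why values such as 0.746 above indicate useful but imperfect discrimination.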
Affiliation(s)
- Jiaxin Hou
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
- Xiaoqi Jia
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China
- Yaoqin Xie
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Wenjian Qin
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Correspondence:
36
Risk Stratification for Breast Cancer Patient by Simultaneous Learning of Molecular Subtype and Survival Outcome Using Genetic Algorithm-Based Gene Set Selection. Cancers (Basel) 2022; 14:cancers14174120. [PMID: 36077657 PMCID: PMC9454699 DOI: 10.3390/cancers14174120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2022] [Revised: 08/18/2022] [Accepted: 08/20/2022] [Indexed: 11/26/2022] Open
Abstract
Simple Summary: Patient stratification is clinically important because it allows us to understand the characteristics of a group and establish treatment strategies for it. Transcriptomic data play an important role in determining molecular subtypes and predicting survival. In breast cancer, although the order of prognosis according to molecular subtype is well known, there is heterogeneity even within a subtype. Therefore, patient stratification that considers both molecular subtype and survival outcome is required. In this study, a methodology to handle this problem is presented. A genetic algorithm is used to select a set of genes, and a risk score is assigned to each patient from their expression levels. According to the risk score, patients are ordered and stratified with respect to both molecular subtype and survival outcome. Consequently, informative genes for patient stratification could be nominated, and the usefulness of the risk score was shown through comparison with other indicators.

Abstract: Patient stratification is a clinically important task because it allows us to establish and develop efficient treatment strategies for particular groups of patients. Molecular subtypes have been successfully defined using transcriptomic profiles and are used effectively in clinical practice, e.g., the PAM50 subtypes of breast cancer. Survival prediction has contributed to understanding diseases and to identifying genes related to prognosis. It is desirable to stratify patients considering these two aspects simultaneously; however, there are no patient stratification methods that consider molecular subtypes and survival outcomes at once. Here, we propose a methodology to deal with this problem. A genetic algorithm is used to select a gene set from transcriptome data, and their expression quantities are utilized to assign a risk score to each patient. The patients are ordered and stratified according to the score. A gene set was selected by our method on a breast cancer cohort (TCGA-BRCA), and we examined its clinical utility using an independent cohort (SCAN-B). In this experiment, our method successfully stratified patients with respect to both molecular subtype and survival outcome. We demonstrated that the patient orderings were consistent across repeated experiments, and prognostic genes were successfully nominated. Additionally, it was observed that the risk score can be used to evaluate the molecular aggressiveness of individual patients.
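The genetic-algorithm gene-set selection can be sketched on toy data; the cohort, the pair-agreement fitness function, and the GA parameters below are illustrative stand-ins for the authors' actual setup:

```python
import random

random.seed(0)

# Toy cohort: genes 0 and 1 rise as survival falls (prognostic); 2-5 are noise.
surv_time = [10, 9, 8, 6, 4, 3, 2, 1]
expr = {
    0: [1, 2, 3, 4, 5, 6, 7, 8],
    1: [2, 2, 3, 5, 5, 7, 7, 9],
    2: [5, 1, 4, 2, 6, 3, 5, 2],
    3: [3, 6, 1, 5, 2, 6, 1, 4],
    4: [4, 4, 4, 4, 4, 4, 4, 4],
    5: [1, 5, 2, 6, 1, 5, 2, 6],
}

def risk_scores(gene_set):
    # Patient risk score = mean expression over the selected genes.
    return [sum(expr[g][p] for g in gene_set) / len(gene_set)
            for p in range(len(surv_time))]

def fitness(gene_set):
    # Count patient pairs whose risk ordering agrees with the survival
    # ordering (higher risk with shorter survival); 28 pairs is the maximum.
    r = risk_scores(gene_set)
    n = len(r)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if (r[i] - r[j]) * (surv_time[j] - surv_time[i]) > 0)

def evolve(pop_size=12, set_size=2, generations=25):
    genes = list(expr)
    pop = [random.sample(genes, set_size) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]                    # selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            child = random.sample(sorted(set(a) | set(b)), set_size)  # crossover
            if random.random() < 0.3:                      # mutation
                child[random.randrange(set_size)] = random.choice(genes)
            if len(set(child)) < set_size:                 # repair duplicates
                child = random.sample(genes, set_size)
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
print(sorted(best), fitness(best))
```

On this toy cohort the planted prognostic pair {0, 1} attains the maximum fitness of 28, so a run that converges should report a set of that quality.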
37
Fremond S, Koelzer VH, Horeweg N, Bosse T. The evolving role of morphology in endometrial cancer diagnostics: From histopathology and molecular testing towards integrative data analysis by deep learning. Front Oncol 2022; 12:928977. [PMID: 36059702 PMCID: PMC9433878 DOI: 10.3389/fonc.2022.928977] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2022] [Accepted: 07/15/2022] [Indexed: 11/13/2022] Open
Abstract
Endometrial cancer (EC) diagnostics is evolving into a system in which molecular aspects are increasingly important. The traditional histological subtype-driven classification has shifted to a molecular-based classification that stratifies EC into DNA polymerase epsilon mutated (POLEmut), mismatch repair deficient (MMRd), and p53 abnormal (p53abn), with the remaining EC classified as no specific molecular profile (NSMP). The molecular EC classification has been implemented in the World Health Organization 2020 classification and the 2021 European treatment guidelines, as it serves as a better basis for patient management. As a result, the integration of the molecular class with histopathological variables has become a critical focus of recent EC research. Pathologists have observed and described several morphological characteristics associated with specific genomic alterations, but these appear insufficient to accurately classify patients into molecular subgroups, so pathologists must rely on molecular ancillary tests in routine workup. In this new era, it has become increasingly challenging to assign clinically relevant weights to histological and molecular features on an individual-patient basis. Deep learning (DL) technology opens new options for the integrative analysis of multi-modal image and molecular datasets together with clinical outcomes. Proof-of-concept studies in other cancers have shown promising accuracy in predicting molecular alterations from H&E-stained tumor slide images, suggesting that morphological characteristics associated with molecular alterations could be identified in EC too, expanding the current understanding of the molecular-driven EC classification. In this review, we report the morphological characteristics of the molecular EC classification currently identified in the literature. Given the new challenges in EC diagnostics, we then discuss the potential supportive role of DL, providing an outlook on relevant studies applying DL to histopathology images in various cancer types, with a focus on EC. Finally, we touch upon how DL might shape the management of future EC patients.
Affiliation(s)
- Sarah Fremond
- Department of Pathology, Leiden University Medical Center (LUMC), Leiden, Netherlands
- Viktor Hendrik Koelzer
- Department of Pathology and Molecular Pathology, University Hospital and University of Zürich, Zürich, Switzerland
- Nanda Horeweg
- Department of Radiotherapy, Leiden University Medical Center, Leiden, Netherlands
- Tjalling Bosse
- Department of Pathology, Leiden University Medical Center (LUMC), Leiden, Netherlands
- Correspondence: Tjalling Bosse
38
Chen RJ, Lu MY, Williamson DFK, Chen TY, Lipkova J, Noor Z, Shaban M, Shady M, Williams M, Joo B, Mahmood F. Pan-cancer integrative histology-genomic analysis via multimodal deep learning. Cancer Cell 2022; 40:865-878.e6. [PMID: 35944502 PMCID: PMC10397370 DOI: 10.1016/j.ccell.2022.07.004] [Citation(s) in RCA: 76] [Impact Index Per Article: 38.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/24/2021] [Revised: 10/08/2021] [Accepted: 07/11/2022] [Indexed: 02/07/2023]
Abstract
The rapidly emerging field of computational pathology has demonstrated promise in developing objective prognostic models from histology images. However, most prognostic models are either based on histology or genomics alone and do not address how these data sources can be integrated to develop joint image-omic prognostic models. Additionally, identifying explainable morphological and molecular descriptors from these models that govern such prognosis is of interest. We use multimodal deep learning to jointly examine pathology whole-slide images and molecular profile data from 14 cancer types. Our weakly supervised, multimodal deep-learning algorithm is able to fuse these heterogeneous modalities to predict outcomes and discover prognostic features that correlate with poor and favorable outcomes. We present all analyses for morphological and molecular correlates of patient prognosis across the 14 cancer types at both a disease and a patient level in an interactive open-access database to allow for further exploration, biomarker discovery, and feature assessment.
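Outer-product (Kronecker) fusion is one common way such heterogeneous modalities can be combined into a joint image-omic representation; the sketch below is a generic illustration with toy embeddings, not the paper's exact fusion module:

```python
def kronecker_fusion(h, g):
    """Outer-product (Kronecker) fusion of two modality embeddings.
    Appending a constant 1 to each vector preserves the unimodal terms
    alongside all pairwise image-omic interaction terms."""
    h1 = h + [1.0]
    g1 = g + [1.0]
    return [a * b for a in h1 for b in g1]

hist = [0.2, 0.8]        # toy histology embedding (e.g., from a WSI encoder)
omic = [1.5, -0.5, 0.3]  # toy molecular-profile embedding
fused = kronecker_fusion(hist, omic)
print(len(fused))  # (2 + 1) * (3 + 1) = 12 fused features
```

A downstream survival head (e.g., a Cox log-risk layer) would then operate on the fused vector, letting it weight cross-modal interactions as well as each modality alone.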
Affiliation(s)
- Richard J Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Mass General Hospital, Harvard Medical School, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA
- Ming Y Lu
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Mass General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA; Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA, USA
- Drew F K Williamson
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Mass General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA
- Tiffany Y Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA
- Jana Lipkova
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA
- Zahra Noor
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Muhammad Shaban
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Mass General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA
- Maha Shady
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA
- Mane Williams
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Mass General Hospital, Harvard Medical School, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA
- Bumjin Joo
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Faisal Mahmood
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Mass General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Cancer Data Science Program, Dana-Farber/Harvard Cancer Institute, Boston, MA, USA; Harvard Data Sciences Initiative, Harvard University, Cambridge, MA, USA
39
Combining Molecular, Imaging, and Clinical Data Analysis for Predicting Cancer Prognosis. Cancers (Basel) 2022; 14:cancers14133215. [PMID: 35804988 PMCID: PMC9265023 DOI: 10.3390/cancers14133215] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2022] [Revised: 06/24/2022] [Accepted: 06/27/2022] [Indexed: 02/04/2023] Open
Abstract
Simple Summary: The rise of Big Data, the widespread use of Machine Learning, and the falling cost of omics techniques have allowed the creation of more sophisticated and accurate models in biomedical research. This article presents the state-of-the-art predictive models of cancer prognosis that use multimodal data, considering clinical, molecular (omics and non-omics), and image data. The subjects of study, the data modalities used, the data processing and modelling methods applied, the validation strategies involved, the integration strategies encompassed, and the evolution of prognostic predictive models are discussed. Finally, we discuss challenges and opportunities in this field of cancer research, which has great potential impact on the clinical management of patients and, by extension, on the implementation of personalised and precision medicine.

Abstract: Cancer is one of the most detrimental diseases globally; accordingly, the prognosis prediction of cancer patients has become a field of interest. In this review, we have gathered 43 state-of-the-art scientific papers published in the last 6 years that built cancer prognosis predictive models using multimodal data. We have defined four main types of data modality: clinical, anatomopathological, molecular, and medical imaging, and we have expanded on the information that each modality provides. The 43 studies were divided into three categories based on the modelling approach taken, and their characteristics were further discussed together with current issues and future trends. Research in this area has evolved from survival analysis through statistical modelling using mainly clinical and anatomopathological data, to the prediction of cancer prognosis through a multi-faceted, data-driven approach that integrates complex, multimodal, high-dimensional data containing multi-omics and medical imaging information, applying Machine Learning and, more recently, Deep Learning techniques. This review concludes that multimodal cancer prognosis models are capable of better stratifying patients, which can improve clinical management, contribute to the implementation of personalised medicine, and provide new and valuable knowledge on cancer biology and its progression.
40
Sinzinger F, Astaraki M, Smedby Ö, Moreno R. Spherical Convolutional Neural Networks for Survival Rate Prediction in Cancer Patients. Front Oncol 2022; 12:870457. [PMID: 35574400 PMCID: PMC9094614 DOI: 10.3389/fonc.2022.870457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2022] [Accepted: 03/21/2022] [Indexed: 11/13/2022] Open
Abstract
Objective: Survival Rate Prediction (SRP) is a valuable tool to assist in the clinical diagnosis and treatment planning of lung cancer patients. In recent years, deep learning (DL)-based methods have shown great potential in medical image processing in general and SRP in particular. This study proposes a fully automated method for SRP from computed tomography (CT) images, which combines automatic tumor segmentation with a DL-based method for extracting rotation-invariant features.
Methods: In the first stage, the tumor is segmented from the CT image of the lungs using a deep-learning-based method in which a variational autoencoder provides additional information to a U-Net segmentation model. Next, the 3D volumetric image of the tumor is projected onto 2D spherical maps, which serve as inputs to a spherical convolutional neural network that approximates the log-risk of a generalized Cox proportional hazards model.
Results: The proposed method is compared with 17 baseline methods that combine different feature sets and prediction models, using three publicly available datasets: Lung1 (n=422), Lung3 (n=89), and H&N1 (n=136). It achieved C-index scores comparable to the best-performing baselines in a 5-fold cross-validation on Lung1 (0.59 ± 0.03 vs. 0.62 ± 0.04) and slightly outperformed all baselines in inter-dataset evaluation (0.64 vs. 0.63); the best-performing baseline from the first experiment dropped to 0.61 and 0.62 on Lung3 and H&N1, respectively.
Discussion: The experiments suggest that the performance of spherical features is comparable with previous approaches, but they generalize better when applied to unseen datasets, which may imply that orientation-independent shape features are relevant for SRP. The performance of the proposed method was very similar with manual and automatic segmentation, making the model useful in cases where expert annotations are not available or difficult to obtain.
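The projection of a 3D tumor volume onto 2D spherical maps can be sketched by ray-casting radial extents from the volume center; the ellipsoid mask, grid size, and sampling resolution below are toy assumptions, not the paper's actual pipeline:

```python
import math

# Toy binary tumor mask on a small voxel grid: an ellipsoid with
# semi-axes 6, 4, 4 voxels centered in the volume.
N = 21
c = N // 2
mask = [[[(x - c) ** 2 / 36 + (y - c) ** 2 / 16 + (z - c) ** 2 / 16 <= 1
          for z in range(N)] for y in range(N)] for x in range(N)]

def spherical_map(mask, n_theta=8, n_phi=16, r_max=10.0, dr=0.25):
    """Project a 3D mask onto a 2D (theta, phi) map: each cell stores the
    largest sampled radius along that ray (from the volume center) that is
    still inside the mask, i.e., the tumor's radial extent in that direction."""
    out = [[0.0] * n_phi for _ in range(n_theta)]
    for i in range(n_theta):
        theta = math.pi * (i + 0.5) / n_theta      # polar angle
        for j in range(n_phi):
            phi = 2 * math.pi * j / n_phi          # azimuth
            dx = math.sin(theta) * math.cos(phi)
            dy = math.sin(theta) * math.sin(phi)
            dz = math.cos(theta)
            r, extent = 0.0, 0.0
            while r <= r_max:
                x = round(c + r * dx)
                y = round(c + r * dy)
                z = round(c + r * dz)
                if 0 <= x < N and 0 <= y < N and 0 <= z < N and mask[x][y][z]:
                    extent = r
                r += dr
            out[i][j] = extent
    return out

smap = spherical_map(mask)
print(len(smap), len(smap[0]))  # 8 16
```

The resulting map is a fixed-size 2D image regardless of tumor orientation in the scanner, which is what makes rotation-robust downstream convolutions possible.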
Affiliation(s)
- Fabian Sinzinger
- Division of Biomedical Imaging, Department of Biomedical Engineering and Health Systems, KTH Royal Institute of Technology, Stockholm, Sweden
- Mehdi Astaraki
- Division of Biomedical Imaging, Department of Biomedical Engineering and Health Systems, KTH Royal Institute of Technology, Stockholm, Sweden; Karolinska Institutet, Department of Oncology-Pathology, Karolinska Universitetssjukhuset, Stockholm, Sweden
- Örjan Smedby
- Division of Biomedical Imaging, Department of Biomedical Engineering and Health Systems, KTH Royal Institute of Technology, Stockholm, Sweden
- Rodrigo Moreno
- Division of Biomedical Imaging, Department of Biomedical Engineering and Health Systems, KTH Royal Institute of Technology, Stockholm, Sweden
41
Lee M. An Ensemble Deep Learning Model with a Gene Attention Mechanism for Estimating the Prognosis of Low-Grade Glioma. BIOLOGY 2022; 11:586. [PMID: 35453785 PMCID: PMC9027395 DOI: 10.3390/biology11040586] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/28/2022] [Revised: 03/30/2022] [Accepted: 04/11/2022] [Indexed: 06/14/2023]
Abstract
Although estimating the prognosis of low-grade glioma (LGG) is a crucial problem, the application of recent improvements in deep learning to this problem has not been extensively studied. The attention mechanism is one of the significant advances; however, it is still unclear how attention mechanisms can be applied to gene expression data for prognosis estimation, because they were designed for convolutional layers and word embeddings. This paper proposes an attention mechanism called gene attention for gene expression data, along with a deep learning model for prognosis estimation of LGG that uses it. The proposed Gene Attention Ensemble NETwork (GAENET) outperformed other conventional methods, including the survival support vector machine and the random survival forest. Evaluated by C-index, GAENET exhibited an improvement of 7.2% compared to the second-best model. In addition, taking advantage of the gene attention mechanism, HILS1 was discovered as the most significant prognostic gene in the trained model. Although HILS1 is known as a pseudogene, it serves as a biomarker for estimating the prognosis of LGG and has demonstrated a possible role in regulating the expression of other prognostic genes.
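The gene attention idea described above can be sketched as a softmax weighting over genes; the expression values and learned scores below are hypothetical, and this is a generic illustration rather than GAENET's actual layer:

```python
import math

def gene_attention(expression, scores):
    """Toy gene-attention layer: per-gene learnable scores are softmax-
    normalized into attention weights, which reweight the expression
    vector. The weights themselves point to candidate prognostic genes."""
    exp_s = [math.exp(s) for s in scores]
    total = sum(exp_s)
    weights = [e / total for e in exp_s]
    attended = [w * x for w, x in zip(weights, expression)]
    return attended, weights

# Three toy genes; the second has the largest learned score, so it
# receives the most attention (hypothetical values for illustration).
expr = [2.0, 1.0, 3.0]
scores = [0.1, 2.0, -1.0]
attended, weights = gene_attention(expr, scores)
print(max(range(3), key=lambda i: weights[i]))  # → 1 (index of the top gene)
```

Inspecting the trained weights, rather than the predictions, is how a gene such as HILS1 could be nominated as the most significant prognostic gene.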
Affiliation(s)
- Minhyeok Lee
- School of Electrical and Electronics Engineering, Chung-Ang University, Seoul 06974, Korea
42
Stahlschmidt SR, Ulfenborg B, Synnergren J. Multimodal deep learning for biomedical data fusion: a review. Brief Bioinform 2022; 23:6516346. [PMID: 35089332 PMCID: PMC8921642 DOI: 10.1093/bib/bbab569] [Citation(s) in RCA: 56] [Impact Index Per Article: 28.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2021] [Revised: 12/06/2021] [Accepted: 12/11/2021] [Indexed: 02/06/2023] Open
Abstract
Biomedical data are becoming increasingly multimodal and thereby capture the underlying complex relationships among biological processes. Deep learning (DL)-based data fusion strategies are a popular approach for modeling these nonlinear relationships. Therefore, we review the current state-of-the-art of such methods and propose a detailed taxonomy that facilitates more informed choices of fusion strategies for biomedical applications, as well as research on novel methods. By doing so, we find that deep fusion strategies often outperform unimodal and shallow approaches. Additionally, the proposed subcategories of fusion strategies show different advantages and drawbacks. The review of current methods has shown that, especially for intermediate fusion strategies, joint representation learning is the preferred approach as it effectively models the complex interactions of different levels of biological organization. Finally, we note that gradual fusion, based on prior biological knowledge or on search strategies, is a promising future research path. Similarly, utilizing transfer learning might overcome sample size limitations of multimodal data sets. As these data sets become increasingly available, multimodal DL approaches present the opportunity to train holistic models that can learn the complex regulatory dynamics behind health and disease.
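The taxonomy of fusion strategies (early, intermediate, late) can be made concrete with a toy sketch; the stand-in encoders and values are illustrative, not any method from the review:

```python
def early_fusion(x_img, x_omic):
    # Early fusion: concatenate raw or low-level features before any modeling.
    return x_img + x_omic

def intermediate_fusion(x_img, x_omic):
    # Intermediate fusion: learn per-modality representations first, then
    # join them into a shared representation (toy linear "encoders" here).
    h_img = [0.5 * v for v in x_img]    # stand-in for an image encoder
    h_omic = [0.5 * v for v in x_omic]  # stand-in for an omics encoder
    return h_img + h_omic               # joint representation

def late_fusion(p_img, p_omic, w=0.5):
    # Late fusion: each modality yields its own prediction; combine at the end.
    return w * p_img + (1 - w) * p_omic

fused = early_fusion([1.0, 2.0], [3.0, 4.0, 5.0])
print(len(fused), late_fusion(0.8, 0.4))
```

The trade-off the review highlights sits between these extremes: late fusion cannot model cross-modal interactions at all, while intermediate fusion can learn a joint representation over the modality encoders.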
Affiliation(s)
- Jane Synnergren
- Systems Biology Research Center, University of Skövde, Sweden
43
A gradient tree boosting and network propagation derived pan-cancer survival network of the tumor microenvironment. iScience 2022; 25:103617. [PMID: 35106465 PMCID: PMC8786644 DOI: 10.1016/j.isci.2021.103617] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2021] [Revised: 11/12/2021] [Accepted: 12/09/2021] [Indexed: 12/22/2022] Open
Abstract
Predicting cancer survival from molecular data is an important aspect of biomedical research because it allows quantifying patient risks and thus individualizing therapy. We introduce XGBoost tree ensemble learning to predict survival from transcriptome data of 8,024 patients from 25 different cancer types and show highly competitive performance with state-of-the-art methods. To further improve the plausibility of the machine learning approach, we conducted two additional steps. In the first step, we applied pan-cancer training and showed that it substantially improves prognosis compared with cancer subtype-specific training. In the second step, we applied network propagation and inferred a pan-cancer survival network consisting of 103 genes. This network highlights cross-cohort features and is predictive for the tumor microenvironment and immune status of the patients. Our work demonstrates that pan-cancer learning combined with network propagation generalizes over multiple cancer types and identifies biologically plausible features that can serve as biomarkers for monitoring cancer survival.

Highlights:
- Highly performing cancer survival prediction with XGBoost
- Pan-cancer training outperforms single-cohort training
- Combined approach consisting of machine learning and network propagation
- Tumor microenvironment is most strongly involved in cancer survival prediction
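The network-propagation step can be sketched as a random walk with restart over a toy gene network; the chain graph, seed scores, and parameters below are illustrative, not the study's actual survival network:

```python
def propagate(adj, seed_scores, alpha=0.5, iters=50):
    """Network propagation (random walk with restart) on a gene graph:
    repeatedly diffuse scores over the degree-normalized adjacency and
    mix back the original seed scores with weight (1 - alpha)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    p = seed_scores[:]
    for _ in range(iters):
        spread = [alpha * sum(adj[i][j] * p[j] / deg[j]
                              for j in range(n) if deg[j])
                  for i in range(n)]
        p = [spread[i] + (1 - alpha) * seed_scores[i] for i in range(n)]
    return p

# Toy 5-gene network: a chain 0-1-2-3-4; only gene 0 carries a seed score
# (e.g., a feature-importance value from the boosted trees).
adj = [[0, 1, 0, 0, 0],
       [1, 0, 1, 0, 0],
       [0, 1, 0, 1, 0],
       [0, 0, 1, 0, 1],
       [0, 0, 0, 1, 0]]
scores = propagate(adj, [1.0, 0.0, 0.0, 0.0, 0.0])
print([round(s, 3) for s in scores])  # scores decay with distance from the seed
```

Smoothing model-derived gene scores over a network in this way is what lets isolated per-gene signals coalesce into a connected subnetwork such as the 103-gene survival network described above.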