1
Zhang T, Ding R, Luong KD, Hsu W. Evaluating an information theoretic approach for selecting multimodal data fusion methods. J Biomed Inform 2025; 167:104833. [PMID: 40354908 DOI: 10.1016/j.jbi.2025.104833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2024] [Revised: 03/17/2025] [Accepted: 04/20/2025] [Indexed: 05/14/2025]
Abstract
OBJECTIVE Interest has grown in combining radiology, pathology, genomic, and clinical data to improve the accuracy of diagnostic and prognostic predictions toward precision health. However, most existing works choose their datasets and modeling approaches empirically and in an ad hoc manner. A prior study proposed four partial information decomposition (PID)-based metrics to provide a theoretical understanding of multimodal data interactions: redundancy, uniqueness of each modality, and synergy. However, these metrics have only been evaluated in a limited collection of biomedical data, and the existing work does not elucidate the effect of parameter selection when calculating the PID metrics. In this work, we evaluate PID metrics on a wider range of biomedical data, including clinical, radiology, pathology, and genomic data, and propose potential improvements to the PID metrics.
METHODS We apply the PID metrics to seven different modality pairs across four distinct cohorts (datasets). We compare and interpret trends in the resulting PID metrics and downstream model performance in these multimodal cohorts. The downstream tasks being evaluated include predicting the prognosis (either overall survival or recurrence) of patients with non-small cell lung cancer, prostate cancer, and glioblastoma.
RESULTS We found that, while PID metrics are informative, solely relying on these metrics to decide on a fusion approach does not always yield a machine learning model with optimal performance. Of the seven different modality pairs, three had poor (0%), three had moderate (66%-89%), and only one had perfect (100%) consistency between the PID values and model performance. We propose two improvements to the PID metrics (determining the optimal parameters and uncertainty estimation) and identified areas where PID metrics could be further improved.
CONCLUSION The current PID metrics are not accurate enough for estimating the multimodal data interactions and need to be improved before they can serve as a reliable tool. We propose improvements and provide suggestions for future work. Code: https://github.com/zhtyolivia/pid-multimodal.
Affiliation(s)
- Tengyue Zhang
- Department of Bioengineering, Medical & Imaging Informatics, Department of Radiological Sciences, David Geffen School of Medicine at University of California, Los Angeles (UCLA), Los Angeles, 90024, CA, USA
- Ruiwen Ding
- Department of Bioengineering, Medical & Imaging Informatics, Department of Radiological Sciences, David Geffen School of Medicine at University of California, Los Angeles (UCLA), Los Angeles, 90024, CA, USA
- Kha-Dinh Luong
- Department of Computer Science, University of California, Santa Barbara (UCSB), Santa Barbara, 93117, CA, USA
- William Hsu
- Department of Bioengineering, Medical & Imaging Informatics, Department of Radiological Sciences, David Geffen School of Medicine at University of California, Los Angeles (UCLA), Los Angeles, 90024, CA, USA.
2
Verma S, Magazzù G, Eftekhari N, Lou T, Gilhespy A, Occhipinti A, Angione C. Cross-attention enables deep learning on limited omics-imaging-clinical data of 130 lung cancer patients. Cell Rep Methods 2024; 4:100817. [PMID: 38981473 PMCID: PMC11294841 DOI: 10.1016/j.crmeth.2024.100817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 04/18/2024] [Accepted: 06/17/2024] [Indexed: 07/11/2024]
Abstract
Deep-learning tools that extract prognostic factors derived from multi-omics data have recently contributed to individualized predictions of survival outcomes. However, the limited size of integrated omics-imaging-clinical datasets poses challenges. Here, we propose two biologically interpretable and robust deep-learning architectures for survival prediction of non-small cell lung cancer (NSCLC) patients, learning simultaneously from computed tomography (CT) scan images, gene expression data, and clinical information. The proposed models integrate patient-specific clinical, transcriptomic, and imaging data and incorporate Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome pathway information, adding biological knowledge within the learning process to extract prognostic gene biomarkers and molecular pathways. While both models accurately stratify patients in high- and low-risk groups when trained on a dataset of only 130 patients, introducing a cross-attention mechanism in a sparse autoencoder significantly improves the performance, highlighting tumor regions and NSCLC-related genes as potential biomarkers and thus offering a significant methodological advancement when learning from small imaging-omics-clinical samples.
Affiliation(s)
- Suraj Verma
- School of Computing, Engineering and Digital Technologies, Teesside University, Middlesbrough, UK
- Thai Lou
- Gateshead Health NHS Foundation Trust, Gateshead, UK
- Alex Gilhespy
- South Tyneside and Sunderland NHS Foundation Trust, Sunderland, UK
- Annalisa Occhipinti
- School of Computing, Engineering and Digital Technologies, Teesside University, Middlesbrough, UK; Centre for Digital Innovation, Teesside University, Middlesbrough, UK; National Horizons Centre, Teesside University, Darlington, UK
- Claudio Angione
- School of Computing, Engineering and Digital Technologies, Teesside University, Middlesbrough, UK; Centre for Digital Innovation, Teesside University, Middlesbrough, UK; National Horizons Centre, Teesside University, Darlington, UK.
3
Corr F, Grimm D, Saß B, Pojskić M, Bartsch JW, Carl B, Nimsky C, Bopp MHA. Radiogenomic Predictors of Recurrence in Glioblastoma—A Systematic Review. J Pers Med 2022; 12:jpm12030402. [PMID: 35330402 PMCID: PMC8952807 DOI: 10.3390/jpm12030402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2022] [Revised: 02/23/2022] [Accepted: 03/01/2022] [Indexed: 12/10/2022] Open
Abstract
Glioblastoma, as the most aggressive brain tumor, is associated with a poor prognosis and outcome. To optimize prognosis and clinical therapy decisions, there is an urgent need to stratify patients with increased risk for recurrent tumors and low therapeutic success to optimize individual treatment. Radiogenomics establishes a link between radiological and pathological information. This review provides a state-of-the-art picture illustrating the latest developments in the use of radiogenomic markers regarding prognosis and their potential for monitoring recurrence. Databases PubMed, Google Scholar, and Cochrane Library were searched. Inclusion criteria were defined as diagnosis of glioblastoma with histopathological and radiological follow-up. Out of 321 reviewed articles, 43 articles met these inclusion criteria. Included studies were analyzed for the frequency of radiological and molecular tumor markers whereby radiogenomic associations were analyzed. Six main associations were described: radiogenomic prognosis, MGMT status, IDH, EGFR status, molecular subgroups, and tumor location. Prospective studies analyzing prognostic features of glioblastoma together with radiological features are lacking. By reviewing the progress in the development of radiogenomic markers, we provide insights into the potential efficacy of such an approach for clinical routine use eventually enabling early identification of glioblastoma recurrence and therefore supporting a further personalized monitoring and treatment strategy.
Affiliation(s)
- Felix Corr
- Department of Neurosurgery, University of Marburg, Baldingerstrasse, 35043 Marburg, Germany; (D.G.); (B.S.); (M.P.); (J.W.B.); (B.C.); (C.N.); (M.H.A.B.)
- EDU Institute of Higher Education, Villa Bighi, Chaplain’s House, KKR 1320 Kalkara, Malta
- Correspondence:
- Dustin Grimm
- Department of Neurosurgery, University of Marburg, Baldingerstrasse, 35043 Marburg, Germany; (D.G.); (B.S.); (M.P.); (J.W.B.); (B.C.); (C.N.); (M.H.A.B.)
- EDU Institute of Higher Education, Villa Bighi, Chaplain’s House, KKR 1320 Kalkara, Malta
- Benjamin Saß
- Department of Neurosurgery, University of Marburg, Baldingerstrasse, 35043 Marburg, Germany; (D.G.); (B.S.); (M.P.); (J.W.B.); (B.C.); (C.N.); (M.H.A.B.)
- Mirza Pojskić
- Department of Neurosurgery, University of Marburg, Baldingerstrasse, 35043 Marburg, Germany; (D.G.); (B.S.); (M.P.); (J.W.B.); (B.C.); (C.N.); (M.H.A.B.)
- Jörg W. Bartsch
- Department of Neurosurgery, University of Marburg, Baldingerstrasse, 35043 Marburg, Germany; (D.G.); (B.S.); (M.P.); (J.W.B.); (B.C.); (C.N.); (M.H.A.B.)
- Center for Mind, Brain and Behavior (CMBB), 35043 Marburg, Germany
- Barbara Carl
- Department of Neurosurgery, University of Marburg, Baldingerstrasse, 35043 Marburg, Germany; (D.G.); (B.S.); (M.P.); (J.W.B.); (B.C.); (C.N.); (M.H.A.B.)
- Department of Neurosurgery, Helios Dr. Horst Schmidt Kliniken, Ludwig-Erhard-Strasse 100, 65199 Wiesbaden, Germany
- Christopher Nimsky
- Department of Neurosurgery, University of Marburg, Baldingerstrasse, 35043 Marburg, Germany; (D.G.); (B.S.); (M.P.); (J.W.B.); (B.C.); (C.N.); (M.H.A.B.)
- Center for Mind, Brain and Behavior (CMBB), 35043 Marburg, Germany
- Miriam H. A. Bopp
- Department of Neurosurgery, University of Marburg, Baldingerstrasse, 35043 Marburg, Germany; (D.G.); (B.S.); (M.P.); (J.W.B.); (B.C.); (C.N.); (M.H.A.B.)
- Center for Mind, Brain and Behavior (CMBB), 35043 Marburg, Germany
4
Liu Q, Hu P. Extendable and explainable deep learning for pan-cancer radiogenomics research. Curr Opin Chem Biol 2022; 66:102111. [PMID: 34999476 DOI: 10.1016/j.cbpa.2021.102111] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/21/2021] [Revised: 12/06/2021] [Accepted: 12/13/2021] [Indexed: 12/12/2022]
Abstract
Radiogenomics is a field where medical images and genomic profiles are jointly analyzed to answer critical clinical questions. Specifically, people want to identify non-invasive imaging biomarkers that are associated with both genomic features and clinical outcomes. Deep learning is an advanced computer science technique that has been applied in many fields, including medical image and genomic data analysis. This review summarizes the current state of deep learning in pan-cancer radiogenomic research, discusses its limitations, and indicates the potential future directions. Traditional machine learning in radiomics, genomics, and radiogenomics has also been briefly discussed. We also summarize the main pan-cancer radiogenomic research resources. Two characteristics of deep learning are emphasized when discussing its application to pan-cancer radiogenomics: extendibility and explainability.
Affiliation(s)
- Qian Liu
- Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Manitoba, R3E 0W3, Canada; Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, R3E 0W3, Canada; Department of Statistics, University of Manitoba, Winnipeg, Manitoba, R3E 0W3, Canada.
- Pingzhao Hu
- Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Manitoba, R3E 0W3, Canada; Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, R3E 0W3, Canada.
5
McGrath SP, Benton ML, Tavakoli M, Tatonetti NP. Predictions, Pivots, and a Pandemic: a Review of 2020's Top Translational Bioinformatics Publications. Yearb Med Inform 2021; 30:219-225. [PMID: 34479393 PMCID: PMC8416221 DOI: 10.1055/s-0041-1726540] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022] Open
Abstract
OBJECTIVES Provide an overview of the emerging themes and notable papers which were published in 2020 in the field of Bioinformatics and Translational Informatics (BTI) for the International Medical Informatics Association Yearbook.
METHODS A team of 16 individuals scanned the literature from the past year. Using a scoring rubric, papers were evaluated on their novelty, importance, and objective quality. 1,224 Medical Subject Headings (MeSH) terms extracted from these papers were used to identify themes and research focuses. The authors then used the scoring results to select notable papers and trends presented in this manuscript.
RESULTS The search phase identified 263 potential papers, and central themes of coronavirus disease 2019 (COVID-19), machine learning, and bioinformatics were examined in greater detail.
CONCLUSIONS When addressing a once-in-a-century pandemic, scientists worldwide answered the call, with informaticians playing a critical role. Productivity and innovations reached new heights in both TBI and science, but significant research gaps remain.
Affiliation(s)
- Scott P. McGrath
- CITRIS Health, University of California Berkeley, Berkeley, CA, USA
- Maryam Tavakoli
- MTERMS Lab, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
6
Smedley NF, Aberle DR, Hsu W. Using deep neural networks and interpretability methods to identify gene expression patterns that predict radiomic features and histology in non-small cell lung cancer. J Med Imaging (Bellingham) 2021; 8:031906. [PMID: 33977113 PMCID: PMC8105647 DOI: 10.1117/1.jmi.8.3.031906] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/21/2020] [Accepted: 04/13/2021] [Indexed: 01/06/2023] Open
Abstract
Purpose: Integrative analysis combining diagnostic imaging and genomic information can uncover biological insights into lesions that are visible on radiologic images. We investigate techniques for interrogating a deep neural network trained to predict quantitative image (radiomic) features and histology from gene expression in non-small cell lung cancer (NSCLC).
Approach: Using 262 training and 89 testing cases from two public datasets, deep feedforward neural networks were trained to predict the values of 101 computed tomography (CT) radiomic features and histology. A model interrogation method called gene masking was used to derive the learned associations between subsets of genes and a radiomic feature or histology class [adenocarcinoma (ADC), squamous cell, and other].
Results: Overall, neural networks outperformed other classifiers. In testing, neural networks classified histology with area under the receiver operating characteristic curves (AUCs) of 0.86 (ADC), 0.91 (squamous cell), and 0.71 (other). Classification performance of radiomics features ranged from 0.42 to 0.89 AUC. Gene masking analysis revealed new and previously reported associations. For example, hypoxia genes predicted histology (>0.90 AUC). Previously published gene signatures for classifying histology were also predictive in our model (>0.80 AUC). Gene sets related to the immune or cardiac systems and cell development processes were predictive (>0.70 AUC) of several different radiomic features. AKT signaling, tumor necrosis factor, and Rho gene sets were each predictive of tumor textures.
Conclusions: This work demonstrates neural networks' ability to map gene expressions to radiomic features and histology types in NSCLC and to interpret the models to identify predictive genes associated with each feature or type.
Affiliation(s)
- Nova F Smedley
- University of California, Los Angeles, Department of Radiological Sciences, Los Angeles, California, United States; University of California, Los Angeles, Department of Bioengineering, Los Angeles, California, United States
- Denise R Aberle
- University of California, Los Angeles, Department of Radiological Sciences, Los Angeles, California, United States; University of California, Los Angeles, Department of Bioengineering, Los Angeles, California, United States
- William Hsu
- University of California, Los Angeles, Department of Radiological Sciences, Los Angeles, California, United States; University of California, Los Angeles, Department of Bioengineering, Los Angeles, California, United States; University of California, Los Angeles, Bioinformatics Interdepartmental Program, Los Angeles, California, United States