1
Bitterman DS, Gensheimer MF, Jaffray D, Pryma DA, Jiang SB, Morin O, Ginart JB, Upadhaya T, Vallis KA, Buatti JM, Deasy J, Hsiao HT, Chung C, Fuller CD, Greenspan E, Cloyd-Warwick K, Courdy S, Mao A, Barnholtz-Sloan J, Topaloglu U, Hands I, Maurer I, Terry M, Curran WJ, Le QT, Nadaf S, Kibbe W. Cancer Informatics for Cancer Centers: Sharing Ideas on How to Build an Artificial Intelligence-Ready Informatics Ecosystem for Radiation Oncology. JCO Clin Cancer Inform 2023; 7:e2300136. [PMID: 38055914; PMCID: PMC10703125; DOI: 10.1200/cci.23.00136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 08/15/2023] [Accepted: 10/16/2023] [Indexed: 12/08/2023]
Abstract
In August 2022, the Cancer Informatics for Cancer Centers brought together cancer informatics leaders for its biannual symposium, Precision Medicine Applications in Radiation Oncology, co-chaired by Quynh-Thu Le, MD (Stanford University), and Walter J. Curran, MD (GenesisCare). Over the course of 3 days, presenters discussed a range of topics relevant to radiation oncology and the cancer informatics community more broadly, including biomarker development, decision support algorithms, novel imaging tools, theranostics, and artificial intelligence (AI) for the radiotherapy workflow. Since the symposium, there has been an impressive shift in the promise and potential for integration of AI in clinical care, accelerated in large part by major advances in generative AI. AI is now poised more than ever to revolutionize cancer care. Radiation oncology is a field that uses and generates a large amount of digital data and is therefore likely to be one of the first fields to be transformed by AI. As experts in the collection, management, and analysis of these data, the informatics community will take a leading role in ensuring that radiation oncology is prepared to take full advantage of these technological advances. In this report, we provide highlights from the symposium, which took place in Santa Barbara, California, from August 29 to 31, 2022. We discuss lessons learned from the symposium for data acquisition, management, representation, and sharing, and put these themes into context to prepare radiation oncology for the successful and safe integration of AI and informatics technologies.
Affiliation(s)
- Danielle S. Bitterman
  - Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA
  - Department of Radiation Oncology, Brigham and Women's Hospital/Dana-Farber Cancer Institute, Boston, MA
- Michael F. Gensheimer
  - Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA
- David Jaffray
  - Department of Radiation Physics, M.D. Anderson Cancer Center, Houston, TX
- Daniel A. Pryma
  - Abramson Cancer Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Steve B. Jiang
  - Medical Artificial Intelligence and Automation Laboratory and Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX
- Olivier Morin
  - Department of Radiation Oncology, MEDomics Laboratory, University of California San Francisco, San Francisco, CA
- Jorge Barrios Ginart
  - Department of Radiation Oncology, MEDomics Laboratory, University of California San Francisco, San Francisco, CA
- Taman Upadhaya
  - Department of Radiation Oncology, MEDomics Laboratory, University of California San Francisco, San Francisco, CA
- Katherine A. Vallis
  - Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA
- John M. Buatti
  - Department of Oncology, University of Oxford, Oxford, United Kingdom
- Joseph Deasy
  - Department of Radiation Oncology, University of Iowa Carver College of Medicine, Iowa City, IA
- H. Timothy Hsiao
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY
- Caroline Chung
  - Department of Scientific Affairs, American Society for Radiation Oncology, Arlington, VA
- Clifton D. Fuller
  - Department of Scientific Affairs, American Society for Radiation Oncology, Arlington, VA
- Emily Greenspan
  - Department of Radiation Oncology, M.D. Anderson Cancer Center, Houston, TX
- Kristy Cloyd-Warwick
  - Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD
- Jill Barnholtz-Sloan
  - Department of Radiation Oncology, M.D. Anderson Cancer Center, Houston, TX
  - Center for Informatics, Digital Vertical, City of Hope National Comprehensive Cancer Center, Los Angeles, CA
- Umit Topaloglu
  - Department of Radiation Oncology, M.D. Anderson Cancer Center, Houston, TX
- Isaac Hands
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD
  - Cancer Research Informatics Shared Resource Facility, University of Kentucky Markey Cancer Center, Lexington, KY
- Quynh-Thu Le
  - Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA
- Sorena Nadaf
  - Department of Radiation Oncology, Emory University, Atlanta, GA
- Warren Kibbe
  - Cancer Center Informatics Society, Los Angeles, CA
2
Zhu J, Oh JH, Simhal AK, Elkin R, Norton L, Deasy JO, Tannenbaum A. Geometric graph neural networks on multi-omics data to predict cancer survival outcomes. Comput Biol Med 2023; 163:107117. [PMID: 37329617; PMCID: PMC10638676; DOI: 10.1016/j.compbiomed.2023.107117] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/23/2023] [Revised: 05/25/2023] [Accepted: 05/30/2023] [Indexed: 06/19/2023]
Abstract
The advance of sequencing technologies has enabled a thorough molecular characterization of the genome in human cancers. To improve patient prognosis predictions and subsequent treatment strategies, it is imperative to develop advanced computational methods to analyze large-scale, high-dimensional genomic data. However, traditional machine learning methods face a challenge in handling the high-dimensional, low-sample size problem that is shown in most genomic data sets. To address this, our group has developed geometric network analysis techniques on multi-omics data in connection with prior biological knowledge derived from protein-protein interactions (PPIs) or pathways. Geometric features obtained from the genomic network, such as Ollivier-Ricci curvature and the invariant measure of the associated Markov chain, have been shown to be predictive of survival outcomes in various cancers. In this study, we propose a novel supervised deep learning method called geometric graph neural network (GGNN) that incorporates such geometric features into deep learning for enhanced predictive power and interpretability. More specifically, we utilize a state-of-the-art graph neural network with sparse connections between the hidden layers based on known biology of the PPI network and pathway information. Geometric features along with multi-omics data are then incorporated into the corresponding layers. The proposed approach utilizes a local-global principle in such a manner that highly predictive features are selected at the front layers and fed directly to the last layer for multivariable Cox proportional-hazards regression modeling. The method was applied to multi-omics data from the CoMMpass study of multiple myeloma and ten major cancers in The Cancer Genome Atlas (TCGA). In most experiments, our method showed superior predictive performance compared to other alternative methods.
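The abstract above ends its network with a multivariable Cox proportional-hazards layer. As a rough illustration of the quantity such a layer usually minimizes, the sketch below computes the negative log partial likelihood in plain Python. The function name, the toy patients, and the assumption of no tied event times are illustrative choices, not details taken from the paper.

```python
import math


def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Negative log partial likelihood of a Cox model (assumes no tied event times).

    risk_scores: model outputs (log relative hazards), one per patient
    times:       follow-up times
    events:      1 if the event was observed, 0 if the patient was censored
    """
    nll = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored patients only contribute through risk sets
        # Risk set: every patient still under observation at time t_i.
        log_denominator = math.log(
            sum(math.exp(r) for r, t in zip(risk_scores, times) if t >= t_i)
        )
        nll -= risk_scores[i] - log_denominator
    return nll


# Hypothetical toy cohort: three patients, the second one censored.
times = [2.0, 3.0, 5.0]
events = [1, 0, 1]
flat = cox_neg_log_partial_likelihood([0.0, 0.0, 0.0], times, events)
ordered = cox_neg_log_partial_likelihood([1.0, 0.0, -1.0], times, events)
print(flat, ordered)
```

Scores that rank the earliest failure as the highest risk give a lower loss than uninformative scores, which is the behaviour a survival network trained on this objective relies on.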
Affiliation(s)
- Jiening Zhu
  - Department of Applied Mathematics & Statistics, Stony Brook University, NY, USA.
- Jung Hun Oh
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, NY, USA.
- Anish K Simhal
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, NY, USA.
- Rena Elkin
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, NY, USA.
- Larry Norton
  - Department of Medicine, Memorial Sloan Kettering Cancer Center, NY, USA.
- Joseph O Deasy
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, NY, USA.
- Allen Tannenbaum
  - Department of Applied Mathematics & Statistics, Stony Brook University, NY, USA; Department of Computer Science, Stony Brook University, NY, USA.
3
Pouryahya M, Oh JH, Javanmard P, Mathews JC, Belkhatir Z, Deasy JO, Tannenbaum AR. aWCluster: A Novel Integrative Network-Based Clustering of Multiomics for Subtype Analysis of Cancer Data. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:1472-1483. [PMID: 33226952; PMCID: PMC9518829; DOI: 10.1109/tcbb.2020.3039511] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/11/2023]
Abstract
The remarkable growth of multi-platform genomic profiles has led to the challenge of multiomics data integration. In this study, we present a novel network-based multiomics clustering founded on the Wasserstein distance from optimal mass transport. This distance has many important geometric properties making it a suitable choice for application in machine learning and clustering. Our proposed method of aggregating multiomics and Wasserstein distance clustering (aWCluster) is applied to breast carcinoma as well as bladder carcinoma, colorectal adenocarcinoma, renal carcinoma, lung non-small cell adenocarcinoma, and endometrial carcinoma from The Cancer Genome Atlas project. Subtypes were characterized by the concordant effect of mRNA expression, DNA copy number alteration, and DNA methylation of genes and their neighbors in the interaction network. aWCluster successfully clusters all cancer types into classes with significantly different survival rates. Also, a gene ontology enrichment analysis of significant genes in the low survival subgroup of breast cancer leads to the well-known phenomenon of tumor hypoxia and the transcription factor ETS1 whose expression is induced by hypoxia. We believe aWCluster has the potential to discover novel subtypes and biomarkers by accentuating the genes that have concordant multiomics measurements in their interaction network, which are challenging to find without the network inference or with single omics analysis.
4
Zhu J, Oh JH, Deasy JO, Tannenbaum AR. vWCluster: Vector-valued optimal transport for network based clustering using multi-omics data in breast cancer. PLoS One 2022; 17:e0265150. [PMID: 35286348; PMCID: PMC8920287; DOI: 10.1371/journal.pone.0265150] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/02/2021] [Accepted: 02/23/2022] [Indexed: 12/28/2022]
Abstract
In this paper, we present a network-based clustering method, called vector Wasserstein clustering (vWCluster), based on the vector-valued Wasserstein distance derived from optimal mass transport (OMT) theory. This approach allows for the natural integration of multi-layer representations of data in a given network from which one derives clusters via a hierarchical clustering approach. In this study, we applied the methodology to multi-omics data from the two largest breast cancer studies. The resultant clusters showed significantly different survival rates in Kaplan-Meier analysis in both datasets. CIBERSORT scores were compared among the identified clusters. Out of the 22 CIBERSORT immune cell types, 9 were commonly significantly different in both datasets, suggesting the difference of tumor immune microenvironment in the clusters. vWCluster can aggregate multi-omics data represented as a vectorial form in a network with multiple layers, taking into account the concordant effect of heterogeneous data, and further identify subgroups of tumors in terms of mortality.
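The abstract above derives clusters by hierarchical clustering on pairwise Wasserstein distances. The sketch below shows that last step only: agglomerative clustering on a precomputed distance matrix. The abstract does not state which linkage was used, so average linkage is an assumption here, and the 4-by-4 matrix is made-up toy data standing in for real vector-valued Wasserstein distances.

```python
def average_linkage(dist, n_clusters):
    """Agglomerative clustering on a symmetric distance matrix (list of lists).

    Repeatedly merges the two clusters with the smallest mean pairwise
    distance until n_clusters remain. Returns lists of sample indices.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                mean_dist = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or mean_dist < best[0]:
                    best = (mean_dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical distances between four tumor samples (not real data).
toy_dist = [
    [0, 1, 9, 8],
    [1, 0, 7, 9],
    [9, 7, 0, 2],
    [8, 9, 2, 0],
]
groups = average_linkage(toy_dist, n_clusters=2)
print(groups)
```

The resulting groups could then be compared with Kaplan-Meier curves, as the abstract describes.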
Affiliation(s)
- Jiening Zhu
  - Department of Applied Mathematics & Statistics, Stony Brook University, New York, NY, United States of America
- Jung Hun Oh
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY, United States of America
- Joseph O. Deasy
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY, United States of America
- Allen R. Tannenbaum
  - Department of Applied Mathematics & Statistics, Stony Brook University, New York, NY, United States of America
  - Departments of Computer Science, Stony Brook University, New York, NY, United States of America
5
Pouryahya M, Oh JH, Mathews JC, Belkhatir Z, Moosmüller C, Deasy JO, Tannenbaum AR. Pan-Cancer Prediction of Cell-Line Drug Sensitivity Using Network-Based Methods. Int J Mol Sci 2022; 23:ijms23031074. [PMID: 35163005; PMCID: PMC8835038; DOI: 10.3390/ijms23031074] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/10/2021] [Revised: 01/15/2022] [Accepted: 01/17/2022] [Indexed: 01/02/2023]
Abstract
The development of reliable predictive models for individual cancer cell lines to identify an optimal cancer drug is a crucial step to accelerate personalized medicine, but vast differences in cancer cell lines and drug characteristics make it quite challenging to develop predictive models that result in high predictive power and explain the similarity of cell lines or drugs. Our study proposes a novel network-based methodology that breaks the problem into smaller, more interpretable problems to improve the predictive power of anti-cancer drug responses in cell lines. For the drug-sensitivity study, we used the GDSC database for 915 cell lines and 200 drugs. The theory of optimal mass transport was first used to separately cluster cell lines and drugs, using gene-expression profiles and extensive cheminformatic drug features, represented in a form of data networks. To predict cell-line specific drug responses, random forest regression modeling was separately performed for each cell-line drug cluster pair. Post-modeling biological analysis was further performed to identify potential biological correlates associated with drug responses. The network-based clustering method resulted in 30 distinct cell-line drug cluster pairs. Predictive modeling on each cell-line-drug cluster outperformed alternative computational methods in predicting drug responses. We found that among the four drugs top-ranked with respect to prediction performance, three targeted the PI3K/mTOR signaling pathway. Predictive modeling on clustered subsets of cell lines and drugs improved the prediction accuracy of cell-line specific drug responses. Post-modeling analysis identified plausible biological processes associated with drug responses.
Affiliation(s)
- Maryam Pouryahya
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; (M.P.); (J.C.M.); (J.O.D.)
- Jung Hun Oh
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; (M.P.); (J.C.M.); (J.O.D.)
- James C. Mathews
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; (M.P.); (J.C.M.); (J.O.D.)
- Zehor Belkhatir
  - School of Engineering and Sustainable Development, De Montfort University, Leicester LE1 9BH, UK
- Caroline Moosmüller
  - Department of Mathematics, University of California at San Diego, La Jolla, CA 92093, USA
- Joseph O. Deasy
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; (M.P.); (J.C.M.); (J.O.D.)
- Allen R. Tannenbaum
  - Departments of Computer Science and Applied Mathematics & Statistics, Stony Brook University, Stony Brook, NY 11794, USA
6
Elkin R, Oh JH, Liu YL, Selenica P, Weigelt B, Reis-Filho JS, Zamarin D, Deasy JO, Norton L, Levine AJ, Tannenbaum AR. Geometric network analysis provides prognostic information in patients with high grade serous carcinoma of the ovary treated with immune checkpoint inhibitors. NPJ Genom Med 2021; 6:99. [PMID: 34819508; PMCID: PMC8613272; DOI: 10.1038/s41525-021-00259-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/29/2021] [Accepted: 10/15/2021] [Indexed: 01/08/2023]
Abstract
Network analysis methods can potentially quantify cancer aberrations in gene networks without introducing fitted parameters or variable selection. A new network curvature-based method is introduced to provide an integrated measure of variability within cancer gene networks. The method is applied to high-grade serous ovarian cancers (HGSOCs) to predict response to immune checkpoint inhibitors (ICIs) and to rank key genes associated with prognosis. Copy number alterations (CNAs) from targeted and whole-exome sequencing data were extracted for HGSOC patients (n = 45) treated with ICIs. CNAs at a gene level were represented on a protein–protein interaction network to define patient-specific networks with a fixed topology. A version of Ollivier–Ricci curvature was used to identify genes that play a potentially key role in response to immunotherapy and further to stratify patients at high risk of mortality. Overall survival (OS) was defined as the time from the start of ICI treatment to either death or last follow-up. Kaplan–Meier analysis with log-rank test was performed to assess OS between the high and low curvature classified groups. The network curvature analysis stratified patients at high risk of mortality with p = 0.00047 in Kaplan–Meier analysis in HGSOC patients receiving ICI. Genes with high curvature were in accordance with CNAs relevant to ovarian cancer. Network curvature using CNAs has the potential to be a novel predictor for OS in HGSOC patients treated with immunotherapy.
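The abstract above compares overall survival between high and low curvature groups with Kaplan-Meier analysis. A minimal sketch of the Kaplan-Meier product-limit estimator follows; the follow-up times and event flags are hypothetical toy values, the group names are placeholders, and the log-rank test mentioned in the abstract is not shown.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate: list of (event_time, survival_probability).

    times:  follow-up time per patient
    events: 1 if death was observed at that time, 0 if censored
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk  # product-limit update
        curve.append((t, survival))
    return curve


# Hypothetical follow-up in months for two patient groups (not study data).
group_a = kaplan_meier([2, 4, 4, 7, 9], [1, 1, 0, 1, 0])
group_b = kaplan_meier([6, 10, 12, 15, 18], [0, 1, 0, 0, 1])
for label, curve in (("group A", group_a), ("group B", group_b)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

Censored patients stay in the at-risk count until their last follow-up but never trigger a drop in the curve, which is why the estimator suits cohorts where follow-up ends before death for some patients.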
Affiliation(s)
- Rena Elkin
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Jung Hun Oh
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Ying L Liu
  - Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Pier Selenica
  - Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Britta Weigelt
  - Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Jorge S Reis-Filho
  - Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Dmitriy Zamarin
  - Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Joseph O Deasy
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Larry Norton
  - Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Allen R Tannenbaum
  - Departments of Computer Science and Applied Mathematics & Statistics, Stony Brook University, Stony Brook, NY, 11794, USA.
7
Rijs Z, Jeremiasse B, Shifai N, Gelderblom H, Sier CFM, Vahrmeijer AL, van Leeuwen FWB, van der Steeg AFW, van de Sande MAJ. Introducing Fluorescence-Guided Surgery for Pediatric Ewing, Osteo-, and Rhabdomyosarcomas: A Literature Review. Biomedicines 2021; 9:biomedicines9101388. [PMID: 34680505; PMCID: PMC8533294; DOI: 10.3390/biomedicines9101388] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/23/2021] [Revised: 09/30/2021] [Accepted: 10/01/2021] [Indexed: 02/07/2023]
Abstract
Sarcomas are a rare heterogeneous group of malignant neoplasms of mesenchymal origin which represent approximately 13% of all cancers in pediatric patients. The most prevalent pediatric bone sarcomas are osteosarcoma (OS) and Ewing sarcoma (ES). Rhabdomyosarcoma (RMS) is the most frequently occurring pediatric soft tissue sarcoma. The median age of OS and ES is approximately 17 years, so this disease is also commonly seen in adults while non-pleiomorphic RMS is rare in the adult population. The mainstay of all treatment regimens is multimodal treatment containing chemotherapy, surgical resection, and sometimes (neo)adjuvant radiotherapy. A clear resection margin improves both local control and overall survival and should be the goal during surgery with a curative intent. Real-time intraoperative fluorescence-guided imaging could facilitate complete resections by visualizing tumor tissue during surgery. This review evaluates whether non-targeted and targeted fluorescence-guided surgery (FGS) could be beneficial for pediatric OS, ES, and RMS patients. Necessities for clinical implementation, current literature, and the positive as well as negative aspects of non-targeted FGS using the NIR dye Indocyanine Green (ICG) were evaluated. In addition, we provide an overview of targets that could potentially be used for FGS in OS, ES, and RMS. Then, due to the time- and cost-efficient translational perspective, we elaborate on the use of antibody-based tracers as well as their disadvantages and alternatives. Finally, we conclude with recommendations for the experiments needed before FGS can be implemented for pediatric OS, ES, and RMS patients.
Affiliation(s)
- Zeger Rijs
  - Department of Orthopedic Surgery, Leiden University Medical Center, Albinusdreef 2, 2333 ZA Leiden, The Netherlands; (N.S.); (M.A.J.v.d.S.)
  - Correspondence: Tel.: +31-641-637-074
- Bernadette Jeremiasse
  - Department of Surgery, Princess Maxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS Utrecht, The Netherlands; (B.J.); (A.F.W.v.d.S.)
- Naweed Shifai
  - Department of Orthopedic Surgery, Leiden University Medical Center, Albinusdreef 2, 2333 ZA Leiden, The Netherlands; (N.S.); (M.A.J.v.d.S.)
- Hans Gelderblom
  - Department of Medical Oncology, Leiden University Medical Center, Albinusdreef 2, 2333 ZA Leiden, The Netherlands
- Cornelis F. M. Sier
  - Department of Surgery, Leiden University Medical Center, Albinusdreef 2, 2333 ZA Leiden, The Netherlands; (C.F.M.S.); (A.L.V.)
  - Percuros BV, 2333 CL Leiden, The Netherlands
- Alexander L. Vahrmeijer
  - Department of Surgery, Leiden University Medical Center, Albinusdreef 2, 2333 ZA Leiden, The Netherlands; (C.F.M.S.); (A.L.V.)
- Fijs W. B. van Leeuwen
  - Interventional Molecular Imaging Laboratory, Department of Radiology, Leiden University Medical Center, Albinusdreef 2, 2333 ZA Leiden, The Netherlands
- Alida F. W. van der Steeg
  - Department of Surgery, Princess Maxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS Utrecht, The Netherlands; (B.J.); (A.F.W.v.d.S.)
- Michiel A. J. van de Sande
  - Department of Orthopedic Surgery, Leiden University Medical Center, Albinusdreef 2, 2333 ZA Leiden, The Netherlands; (N.S.); (M.A.J.v.d.S.)
8
Oh JH, Apte AP, Katsoulakis E, Riaz N, Hatzoglou V, Yu Y, Mahmood U, Veeraraghavan H, Pouryahya M, Iyer A, Shukla-Dave A, Tannenbaum A, Lee NY, Deasy JO. Reproducibility of radiomic features using network analysis and its application in Wasserstein k-means clustering. J Med Imaging (Bellingham) 2021; 8:031904. [PMID: 33954225; PMCID: PMC8085581; DOI: 10.1117/1.jmi.8.3.031904] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/13/2020] [Accepted: 04/02/2021] [Indexed: 12/24/2022]
Abstract
Purpose: The goal of this study is to develop innovative methods for identifying radiomic features that are reproducible over varying image acquisition settings. Approach: We propose a regularized partial correlation network to identify reliable and reproducible radiomic features. This approach was tested on two radiomic feature sets generated using two different reconstruction methods on computed tomography (CT) scans from a cohort of 47 lung cancer patients. The largest common network component between the two networks was tested on phantom data consisting of five cancer samples. To further investigate whether radiomic features found can identify phenotypes, we propose a k-means clustering algorithm coupled with the optimal mass transport theory. This approach following the regularized partial correlation network analysis was tested on CT scans from 77 head and neck squamous cell carcinoma (HNSCC) patients in The Cancer Imaging Archive (TCIA) and validated using an independent dataset. Results: A set of common radiomic features was found in relatively large network components between the resultant two partial correlation networks resulting from a cohort of lung cancer patients. The reliability and reproducibility of those radiomic features were further validated on phantom data using the Wasserstein distance. Further analysis using the network-based Wasserstein k-means algorithm on the TCIA HNSCC data showed that the resulting clusters separate tumor subsites as well as HPV status, and this was validated on an independent dataset. Conclusion: We showed that a network-based analysis enables identifying reproducible radiomic features and use of the selected set of features can enhance clustering results.
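To make the idea of k-means coupled with optimal mass transport concrete, here is a deliberately simplified sketch. It is not the network-based algorithm of the paper: it assumes each sample is a one-dimensional empirical distribution with the same number of points. In that special case the 2-Wasserstein distance reduces to the root-mean-square gap between sorted values, and the Wasserstein barycenter is the average of the sorted values. All names and the toy samples are hypothetical.

```python
import math


def w2_sorted(a, b):
    """2-Wasserstein distance between two equal-size 1D samples (both sorted)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def wasserstein_kmeans_1d(samples, init_centers, n_iter=10):
    """k-means where distance and mean are replaced by W2 and its 1D barycenter."""
    data = [sorted(s) for s in samples]
    centers = [sorted(c) for c in init_centers]
    labels = [0] * len(data)
    for _ in range(n_iter):
        # Assignment step: nearest center in Wasserstein distance.
        labels = [
            min(range(len(centers)), key=lambda k: w2_sorted(s, centers[k]))
            for s in data
        ]
        # Update step: 1D barycenter = position-wise mean of sorted members.
        for k in range(len(centers)):
            members = [s for s, lab in zip(data, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical feature samples for four patients (not real radiomic data).
toy_samples = [[0, 1, 2], [0.1, 1.1, 2.2], [5, 6, 7], [5.2, 6.1, 6.9]]
labels, centers = wasserstein_kmeans_1d(
    toy_samples, init_centers=[toy_samples[0], toy_samples[2]]
)
print(labels)
```

In higher dimensions or on networks neither shortcut holds, and both the distance and the barycenter need an optimal transport solver.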
Affiliation(s)
- Jung Hun Oh
  - Memorial Sloan Kettering Cancer Center, Department of Medical Physics, New York, United States
- Aditya P Apte
  - Memorial Sloan Kettering Cancer Center, Department of Medical Physics, New York, United States
- Evangelia Katsoulakis
  - Veterans Affairs, James A Haley, Department of Radiation Oncology, Tampa, Florida, United States
- Nadeem Riaz
  - Memorial Sloan Kettering Cancer Center, Department of Radiation Oncology, New York, United States
- Vaios Hatzoglou
  - Memorial Sloan Kettering Cancer Center, Department of Radiology, New York, United States
- Yao Yu
  - Memorial Sloan Kettering Cancer Center, Department of Radiation Oncology, New York, United States
- Usman Mahmood
  - Memorial Sloan Kettering Cancer Center, Department of Medical Physics, New York, United States
- Harini Veeraraghavan
  - Memorial Sloan Kettering Cancer Center, Department of Medical Physics, New York, United States
- Maryam Pouryahya
  - Memorial Sloan Kettering Cancer Center, Department of Medical Physics, New York, United States
- Aditi Iyer
  - Memorial Sloan Kettering Cancer Center, Department of Medical Physics, New York, United States
- Amita Shukla-Dave
  - Memorial Sloan Kettering Cancer Center, Department of Medical Physics, New York, United States
- Allen Tannenbaum
  - Stony Brook University, Department of Computer Science, Stony Brook, New York, United States
  - Stony Brook University, Department of Applied Mathematics and Statistics, Stony Brook, New York, United States
- Nancy Y Lee
  - Memorial Sloan Kettering Cancer Center, Department of Radiation Oncology, New York, United States
- Joseph O Deasy
  - Memorial Sloan Kettering Cancer Center, Department of Medical Physics, New York, United States
9
Oh JH, Pouryahya M, Iyer A, Apte AP, Deasy JO, Tannenbaum A. A novel kernel Wasserstein distance on Gaussian measures: An application of identifying dental artifacts in head and neck computed tomography. Comput Biol Med 2020; 120:103731. [PMID: 32217284; PMCID: PMC7237301; DOI: 10.1016/j.compbiomed.2020.103731] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/03/2020] [Revised: 03/18/2020] [Accepted: 03/22/2020] [Indexed: 01/30/2023]
Abstract
The Wasserstein distance is a powerful metric based on the theory of optimal mass transport. It gives a natural measure of the distance between two distributions with a wide range of applications. In contrast to a number of the common divergences on distributions such as Kullback-Leibler or Jensen-Shannon, it is (weakly) continuous, and thus ideal for analyzing corrupted and noisy data. Until recently, however, no kernel methods for dealing with nonlinear data have been proposed via the Wasserstein distance. In this work, we develop a novel method to compute the L2-Wasserstein distance in reproducing kernel Hilbert spaces (RKHS) called kernel L2-Wasserstein distance, which is implemented using the kernel trick. The latter is a general method in machine learning employed to handle data in a nonlinear manner. We evaluate the proposed approach in identifying computed tomography (CT) slices with dental artifacts in head and neck cancer, performing unsupervised hierarchical clustering on the resulting Wasserstein distance matrix that is computed on imaging texture features extracted from each CT slice. We further compare the performance of kernel Wasserstein distance with alternatives including kernel Kullback-Leibler divergence we previously developed. Our experiments show that the kernel approach outperforms classical non-kernel approaches in identifying CT slices with artifacts.
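The contrast drawn above between the Wasserstein distance and divergences such as Kullback-Leibler can be seen in the simplest setting, two one-dimensional Gaussians, where both quantities have closed forms. The sketch below uses those standard formulas; it does not implement the kernel (RKHS) version proposed in the paper, and the function names and test values are illustrative.

```python
import math


def w2_gaussian_1d(m1, s1, m2, s2):
    """L2-Wasserstein distance between N(m1, s1^2) and N(m2, s2^2)."""
    return math.sqrt((m1 - m2) ** 2 + (s1 - s2) ** 2)


def kl_gaussian_1d(m1, s1, m2, s2):
    """Kullback-Leibler divergence KL(N(m1, s1^2) || N(m2, s2^2))."""
    return math.log(s2 / s1) + (s1**2 + (m1 - m2) ** 2) / (2 * s2**2) - 0.5


# Shrink the second Gaussian toward a point mass: W2 stays bounded,
# while the KL divergence grows without limit.
for s2 in (1.0, 0.1, 0.001):
    w2 = w2_gaussian_1d(0.0, 1.0, 0.5, s2)
    kl = kl_gaussian_1d(0.0, 1.0, 0.5, s2)
    print(f"s2={s2}: W2={w2:.3f}, KL={kl:.1f}")
```

This bounded behaviour under degenerate or noisy distributions is one way to read the abstract's remark that the Wasserstein distance suits corrupted data.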
Affiliation(s)
- Jung Hun Oh
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, USA.
- Maryam Pouryahya
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, USA
- Aditi Iyer
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, USA
- Aditya P Apte
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, USA
- Joseph O Deasy
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, USA
- Allen Tannenbaum
  - Departments of Computer Science and Applied Mathematics & Statistics, Stony Brook University, USA
10
Mathews JC, Pouryahya M, Moosmüller C, Kevrekidis YG, Deasy JO, Tannenbaum A. Molecular phenotyping using networks, diffusion, and topology: soft tissue sarcoma. Sci Rep 2019; 9:13982. [PMID: 31562358; PMCID: PMC6764992; DOI: 10.1038/s41598-019-50300-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/27/2018] [Accepted: 09/06/2019] [Indexed: 11/24/2022]
Abstract
Many biological datasets are high-dimensional yet manifest an underlying order. In this paper, we describe an unsupervised data analysis methodology that operates in the setting of a multivariate dataset and a network which expresses influence between the variables of the given set. The technique involves network geometry employing the Wasserstein distance, global spectral analysis in the form of diffusion maps, and topological data analysis using the Mapper algorithm. The prototypical application is to gene expression profiles obtained from RNA-Seq experiments on a collection of tissue samples, considering only genes whose protein products participate in a known pathway or network of interest. Employing the technique, we discern several coherent states or signatures displayed by the gene expression profiles of the sarcomas in the Cancer Genome Atlas along the TP53 (p53) signaling network. The signatures substantially recover the leiomyosarcoma, dedifferentiated liposarcoma (DDLPS), and synovial sarcoma histological subtype diagnoses, and they also include a new signature defined by activation and inactivation of about a dozen genes, including activation of serine endopeptidase inhibitor SERPINE1 and inactivation of TP53-family tumor suppressor gene TP73.
Affiliation(s)
- James C Mathews
  - Department of Medical Physics, Memorial Sloan-Kettering Cancer Center, New York, USA.
- Maryam Pouryahya
  - Department of Medical Physics, Memorial Sloan-Kettering Cancer Center, New York, USA
- Caroline Moosmüller
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, USA
- Yannis G Kevrekidis
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, USA
- Joseph O Deasy
  - Department of Medical Physics, Memorial Sloan-Kettering Cancer Center, New York, USA
- Allen Tannenbaum
  - Departments of Computer Science and Applied Mathematics & Statistics, Stony Brook University, Stony Brook, USA
11
Chen Y, Georgiou TT, Ning L, Tannenbaum A. Matricial Wasserstein-1 Distance. IEEE Control Syst Lett 2017; 1:14-19. [PMID: 29152609; PMCID: PMC5687101; DOI: 10.1109/lcsys.2017.2699319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
We propose an extension of the Wasserstein 1-metric (W1) for density matrices, matrix-valued density measures, and an unbalanced interpretation of mass transport. We use duality theory and, in particular, a "dual of the dual" formulation of W1. This matrix analogue of the Earth Mover's Distance has several attractive features including ease of computation.
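For orientation, the scalar quantity being generalized above, the Wasserstein 1-metric or Earth Mover's Distance, is easy to compute in one dimension: it equals the accumulated absolute gap between the two cumulative distributions. The sketch below covers only that classical, balanced, scalar case on a shared evenly spaced grid. It does not implement the matrix-valued or unbalanced extension the paper proposes, and the function name and histograms are illustrative.

```python
def wasserstein1_1d(p, q, bin_width=1.0):
    """W1 (Earth Mover's Distance) between two histograms on the same 1D grid.

    p, q: bin masses with equal totals (balanced transport).
    """
    if len(p) != len(q):
        raise ValueError("histograms must share the same grid")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("total mass must match in the balanced case")
    cdf_gap = 0.0  # running difference of the two cumulative sums
    cost = 0.0
    for a, b in zip(p, q):
        cdf_gap += a - b
        cost += abs(cdf_gap) * bin_width  # mass that must cross this bin edge
    return cost


# Moving one unit of mass two bins to the right costs 2.
print(wasserstein1_1d([1, 0, 0], [0, 0, 1]))
# Shifting a two-bin histogram by one bin costs 1.
print(wasserstein1_1d([0.5, 0.5, 0], [0, 0.5, 0.5]))
```

The running gap is the net mass that has to flow across each bin boundary, which is why its absolute sum gives the minimal transport cost in one dimension.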
Affiliation(s)
- Yongxin Chen
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, NY
- Tryphon T Georgiou
  - Department of Mechanical and Aerospace Engineering, University of California, Irvine, CA
- Lipeng Ning
  - Brigham and Women's Hospital (Harvard Medical School), MA
- Allen Tannenbaum
  - Departments of Computer Science and Applied Mathematics & Statistics, Stony Brook University, NY