1
|
Zeinali N, Albashayreh A, Fan W, White SG. Symptom-BERT: Enhancing Cancer Symptom Detection in EHR Clinical Notes. J Pain Symptom Manage 2024:S0885-3924(24)00784-X. [PMID: 38789092 DOI: 10.1016/j.jpainsymman.2024.05.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 05/08/2024] [Accepted: 05/14/2024] [Indexed: 05/26/2024]
Abstract
CONTEXT Extracting cancer symptom documentation allows clinicians to develop highly individualized symptom prediction algorithms to deliver symptom management care. Leveraging advanced language models to detect symptom data in clinical narratives can significantly enhance this process. OBJECTIVE This study uses a pretrained large language model to detect and extract cancer symptoms in clinical notes. METHODS We developed a pretrained language model to identify cancer symptoms in clinical notes based on a clinical corpus from the Enterprise Data Warehouse for Research at a healthcare system in the Midwestern United States. This study was conducted in 4 phases: (1) pretraining a Bio-Clinical BERT model on one million unlabeled clinical documents, (2) fine-tuning Symptom-BERT for detecting 13 cancer symptom groups within 1112 annotated clinical notes, (3) generating 180 synthetic clinical notes using ChatGPT-4 for external validation, and (4) comparing the internal and external performance of Symptom-BERT against a non-pretrained version and six other BERT implementations. RESULTS The Symptom-BERT model effectively detected cancer symptoms in clinical notes. It achieved results with a micro-averaged F1-score of 0.933, an AUC of 0.929 internally, and 0.831 and 0.834 externally. Our analysis shows that physical symptoms, like pruritus, are typically identified with higher performance than psychological symptoms, such as anxiety. CONCLUSION This study underscores the transformative potential of specialized pretraining on domain-specific data in boosting the performance of language models for medical applications. The Symptom-BERT model's exceptional efficacy in detecting cancer symptoms heralds a groundbreaking stride in patient-centered AI technologies, offering a promising path to elevate symptom management and cultivate superior patient self-care outcomes.
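The abstract above reports a micro-averaged F1-score for a multi-label task (13 symptom groups per note). As a rough illustration of how such a score can be computed, here is a minimal sketch in plain Python. The function name `micro_f1`, the toy labels, and the three-label layout are assumptions made for illustration; they are not taken from the study, whose evaluation code is not described in the abstract.

```python
from typing import Sequence


def micro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1: pool every (note, label) decision, then compute one F1."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1  # symptom present and detected
            elif t == 0 and p == 1:
                fp += 1  # symptom absent but predicted
            elif t == 1 and p == 0:
                fn += 1  # symptom present but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical): 3 notes x 3 symptom groups, 1 = symptom documented.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]
score = micro_f1(y_true, y_pred)
print(f"micro-F1 = {score:.3f}")
```

Because micro-averaging pools all decisions before computing the score, frequent symptom groups weigh more heavily than rare ones; a macro average would instead treat each group equally.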
Affiliation(s)
- Nahid Zeinali
- Department of Computer Science and Informatics (N.Z.), University of Iowa, Iowa, USA.
| | - Alaa Albashayreh
- College of Nursing (A.A., S.G.W.), University of Iowa, Iowa, USA
| | - Weiguo Fan
- Department of Business Analytics (W.F.), University of Iowa, Iowa, USA
| | | |
|
2
|
Zhang C, Xu J, Tang R, Yang J, Wang W, Yu X, Shi S. Novel research and future prospects of artificial intelligence in cancer diagnosis and treatment. J Hematol Oncol 2023; 16:114. [PMID: 38012673 PMCID: PMC10680201 DOI: 10.1186/s13045-023-01514-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Accepted: 11/20/2023] [Indexed: 11/29/2023] Open
Abstract
Research into the potential benefits of artificial intelligence for comprehending the intricate biology of cancer has grown as a result of the widespread use of deep learning and machine learning in the healthcare sector and the availability of highly specialized cancer datasets. Here, we review new artificial intelligence approaches and how they are being used in oncology. We describe how artificial intelligence might be used in the detection, prognosis, and administration of cancer treatments and introduce the use of the latest large language models such as ChatGPT in oncology clinics. We highlight artificial intelligence applications for omics data types, and we offer perspectives on how the various data types might be combined to create decision-support tools. We also evaluate the present constraints and challenges to applying artificial intelligence in precision oncology. Finally, we discuss how current challenges may be surmounted to make artificial intelligence useful in clinical settings in the future.
Affiliation(s)
- Chaoyi Zhang
- Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, No. 270 Dong'An Road, Shanghai, 200032, People's Republic of China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, People's Republic of China
- Shanghai Pancreatic Cancer Institute, No. 399 Lingling Road, Shanghai, 200032, People's Republic of China
- Pancreatic Cancer Institute, Fudan University, Shanghai, 200032, People's Republic of China
| | - Jin Xu
- Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, No. 270 Dong'An Road, Shanghai, 200032, People's Republic of China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, People's Republic of China
- Shanghai Pancreatic Cancer Institute, No. 399 Lingling Road, Shanghai, 200032, People's Republic of China
- Pancreatic Cancer Institute, Fudan University, Shanghai, 200032, People's Republic of China
| | - Rong Tang
- Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, No. 270 Dong'An Road, Shanghai, 200032, People's Republic of China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, People's Republic of China
- Shanghai Pancreatic Cancer Institute, No. 399 Lingling Road, Shanghai, 200032, People's Republic of China
- Pancreatic Cancer Institute, Fudan University, Shanghai, 200032, People's Republic of China
| | - Jianhui Yang
- Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, No. 270 Dong'An Road, Shanghai, 200032, People's Republic of China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, People's Republic of China
- Shanghai Pancreatic Cancer Institute, No. 399 Lingling Road, Shanghai, 200032, People's Republic of China
- Pancreatic Cancer Institute, Fudan University, Shanghai, 200032, People's Republic of China
| | - Wei Wang
- Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, No. 270 Dong'An Road, Shanghai, 200032, People's Republic of China
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, People's Republic of China
- Shanghai Pancreatic Cancer Institute, No. 399 Lingling Road, Shanghai, 200032, People's Republic of China
- Pancreatic Cancer Institute, Fudan University, Shanghai, 200032, People's Republic of China
| | - Xianjun Yu
- Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, No. 270 Dong'An Road, Shanghai, 200032, People's Republic of China.
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, People's Republic of China.
- Shanghai Pancreatic Cancer Institute, No. 399 Lingling Road, Shanghai, 200032, People's Republic of China.
- Pancreatic Cancer Institute, Fudan University, Shanghai, 200032, People's Republic of China.
| | - Si Shi
- Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, No. 270 Dong'An Road, Shanghai, 200032, People's Republic of China.
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, 200032, People's Republic of China.
- Shanghai Pancreatic Cancer Institute, No. 399 Lingling Road, Shanghai, 200032, People's Republic of China.
- Pancreatic Cancer Institute, Fudan University, Shanghai, 200032, People's Republic of China.
| |
|
3
|
Kotevski DP, Vajdic CM, Field M, Smee RI. Inter-hospital variation in data collection, radiotherapy treatment, and survival in patients with head and neck cancer: A multisite study. Radiother Oncol 2023; 188:109843. [PMID: 37543056 DOI: 10.1016/j.radonc.2023.109843] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Revised: 06/14/2023] [Accepted: 07/27/2023] [Indexed: 08/07/2023]
Abstract
BACKGROUND AND PURPOSE Inter-hospital inequalities in head and neck cancer (HNC) survival may exist due to variation in radiotherapy treatment-related factors. This study investigated inter-hospital variation in data collection, primary radiotherapy treatment, and survival in HNC patients from an Australian setting. MATERIALS AND METHODS Data collected in oncology information systems (OIS) from seven Australian hospitals was extracted for 3,182 adults treated with curative radiotherapy, with or without surgery or chemotherapy, for primary, non-metastatic squamous cell carcinoma of the head and neck (2000-2017). Death data was sourced from the National Death Index using record linkage. Multivariable Cox regression was used to assess the association between survival and hospital. RESULTS Inter-hospital variation in data collection, primary radiotherapy dose, and five-year HNC-related death was detected. Completion of eleven fields ranged from 66%-98%. Primary radiotherapy treated Tis-T1N0 glottic and any stage oral cavity and oropharynx cancers received significantly different time-corrected biologically equivalent dose in two gray fractions (EQD2T) by hospital, with observed deviation from Australian radiotherapy guidelines. Increased EQD2T dose was associated with a reduced risk of five-year HNC-related death in all patients and those treated with primary radiotherapy. Hospital, tumour site, and T and N classification were also identified as independent prognostic factors for five-year HNC-related death in all patients treated with radiotherapy. CONCLUSION Unexplained variation exists in HNC-related death in patients treated at Australian hospitals. Available routinely collected data in OIS are insufficient to explain variation in survival. Innovative data collection, extraction, and classification practices are needed to inform clinical practice.
Affiliation(s)
- Damian P Kotevski
- Department of Radiation Oncology, Prince of Wales Hospital and Community Health Services, New South Wales, Australia; Prince of Wales Clinical School, Faculty of Medicine, University of New South Wales, New South Wales, Australia.
| | - Claire M Vajdic
- Kirby Institute, Faculty of Medicine, University of New South Wales, New South Wales, Australia
| | - Matthew Field
- South Western Sydney Clinical Campus, School of Clinical Medicine, University of New South Wales, New South Wales, Australia; South Western Sydney Cancer Services, NSW Health, New South Wales, Australia; Ingham Institute for Applied Medical Research, New South Wales, Australia
| | - Robert I Smee
- Department of Radiation Oncology, Prince of Wales Hospital and Community Health Services, New South Wales, Australia; Prince of Wales Clinical School, Faculty of Medicine, University of New South Wales, New South Wales, Australia; Department of Radiation Oncology, Tamworth Base Hospital, Tamworth, New South Wales, Australia
| |
|
4
|
Elbatarny L, Do RKG, Gangai N, Ahmed F, Chhabra S, Simpson AL. Applying Natural Language Processing to Single-Report Prediction of Metastatic Disease Response Using the OR-RADS Lexicon. Cancers (Basel) 2023; 15:4909. [PMID: 37894276 PMCID: PMC10605614 DOI: 10.3390/cancers15204909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 09/25/2023] [Accepted: 09/26/2023] [Indexed: 10/29/2023] Open
Abstract
Generating Real World Evidence (RWE) on disease responses from radiological reports is important for understanding cancer treatment effectiveness and developing personalized treatment. A lack of standardization in reporting among radiologists impacts the feasibility of large-scale interpretation of disease response. This study examines the utility of applying natural language processing (NLP) to the large-scale interpretation of disease responses using a standardized oncologic response lexicon (OR-RADS) to facilitate RWE collection. Radiologists annotated 3503 retrospectively collected clinical impressions from radiological reports across several cancer types with one of seven OR-RADS categories. A Bidirectional Encoder Representations from Transformers (BERT) model was trained on this dataset with an 80-20% train/test split to perform multiclass and single-class classification tasks using the OR-RADS. Radiologists also performed the classification to compare human and model performance. The model achieved accuracies from 95 to 99% across all classification tasks, performing better in single-class tasks compared to the multiclass task and producing minimal misclassifications, which pertained mostly to overpredicting the equivocal and mixed OR-RADS labels. Human accuracy ranged from 74 to 93% across all classification tasks, performing better on single-class tasks. This study demonstrates the feasibility of the BERT NLP model in predicting disease response in cancer patients, exceeding human performance, and encourages the use of the standardized OR-RADS lexicon to improve large-scale prediction accuracy.
Affiliation(s)
- Lydia Elbatarny
- School of Computing, Queen’s University, Kingston, ON K7L 2N8, Canada;
| | - Richard K. G. Do
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; (N.G.); (F.A.); (S.C.)
| | - Natalie Gangai
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; (N.G.); (F.A.); (S.C.)
| | - Firas Ahmed
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; (N.G.); (F.A.); (S.C.)
| | - Shalini Chhabra
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; (N.G.); (F.A.); (S.C.)
| | - Amber L. Simpson
- School of Computing, Queen’s University, Kingston, ON K7L 2N8, Canada;
- Department of Biomedical and Molecular Sciences, Queen’s University, Kingston, ON K7L 2V7, Canada
| |
|
5
|
Tan RSYC, Lin Q, Low GH, Lin R, Goh TC, Chang CCE, Lee FF, Chan WY, Tan WC, Tey HJ, Leong FL, Tan HQ, Nei WL, Chay WY, Tai DWM, Lai GGY, Cheng LTE, Wong FY, Chua MCH, Chua MLK, Tan DSW, Thng CH, Tan IBH, Ng HT. Inferring cancer disease response from radiology reports using large language models with data augmentation and prompting. J Am Med Inform Assoc 2023; 30:1657-1664. [PMID: 37451682 PMCID: PMC10531105 DOI: 10.1093/jamia/ocad133] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/25/2023] [Revised: 06/27/2023] [Accepted: 07/04/2023] [Indexed: 07/18/2023] Open
Abstract
OBJECTIVE To assess large language models on their ability to accurately infer cancer disease response from free-text radiology reports. MATERIALS AND METHODS We assembled 10 602 computed tomography reports from cancer patients seen at a single institution. All reports were classified into: no evidence of disease, partial response, stable disease, or progressive disease. We applied transformer models, a bidirectional long short-term memory model, a convolutional neural network model, and conventional machine learning methods to this task. Data augmentation using sentence permutation with consistency loss as well as prompt-based fine-tuning were used on the best-performing models. Models were validated on a hold-out test set and an external validation set based on Response Evaluation Criteria in Solid Tumors (RECIST) classifications. RESULTS The best-performing model was the GatorTron transformer which achieved an accuracy of 0.8916 on the test set and 0.8919 on the RECIST validation set. Data augmentation further improved the accuracy to 0.8976. Prompt-based fine-tuning did not further improve accuracy but was able to reduce the number of training reports to 500 while still achieving good performance. DISCUSSION These models could be used by researchers to derive progression-free survival in large datasets. It may also serve as a decision support tool by providing clinicians an automated second opinion of disease response. CONCLUSIONS Large clinical language models demonstrate potential to infer cancer disease response from radiology reports at scale. Data augmentation techniques are useful to further improve performance. Prompt-based fine-tuning can significantly reduce the size of the training dataset.
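The abstract above mentions data augmentation by sentence permutation. The following is a minimal sketch of the permutation step only, assuming that a report can be split on sentence-ending punctuation and that sentence order does not change the response label. The helper names `split_sentences` and `permute_sentences` and the sample report are hypothetical, and the consistency loss the authors pair with this augmentation is not reproduced here.

```python
import random
import re


def split_sentences(report: str) -> list[str]:
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", report.strip())
    return [p.strip() for p in parts if p.strip()]


def permute_sentences(report: str, seed: int = 0) -> str:
    """Return a copy of the report with its sentences in a shuffled order."""
    sentences = split_sentences(report)
    rng = random.Random(seed)  # seeded so the augmentation is reproducible
    rng.shuffle(sentences)
    return " ".join(sentences)


# Hypothetical report text, for illustration only.
report = "Liver lesion is smaller. No new nodules. Lymph nodes are stable."
augmented = permute_sentences(report, seed=1)
print(augmented)
```

Each augmented copy keeps the original label, so one annotated report can yield several training examples; real radiology text would likely need a more careful sentence splitter than this regular expression.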
Affiliation(s)
- Ryan Shea Ying Cong Tan
- Division of Medical Oncology, National Cancer Centre Singapore, Singapore
- Duke-NUS Medical School, Singapore
| | - Qian Lin
- Department of Computer Science, National University of Singapore, Singapore
| | - Guat Hwa Low
- Division of Medical Oncology, National Cancer Centre Singapore, Singapore
| | - Ruixi Lin
- Department of Computer Science, National University of Singapore, Singapore
| | - Tzer Chew Goh
- Institute of Systems Science, National University of Singapore, Singapore
| | | | - Fung Fung Lee
- Institute of Systems Science, National University of Singapore, Singapore
| | - Wei Yin Chan
- Institute of Systems Science, National University of Singapore, Singapore
| | - Wei Chong Tan
- Division of Medical Oncology, National Cancer Centre Singapore, Singapore
- Duke-NUS Medical School, Singapore
| | - Han Jieh Tey
- Division of Medical Oncology, National Cancer Centre Singapore, Singapore
| | - Fun Loon Leong
- Division of Medical Oncology, National Cancer Centre Singapore, Singapore
| | - Hong Qi Tan
- Division of Radiation Oncology, National Cancer Centre Singapore, Singapore
| | - Wen Long Nei
- Division of Radiation Oncology, National Cancer Centre Singapore, Singapore
| | - Wen Yee Chay
- Division of Medical Oncology, National Cancer Centre Singapore, Singapore
- Duke-NUS Medical School, Singapore
| | - David Wai Meng Tai
- Division of Medical Oncology, National Cancer Centre Singapore, Singapore
- Duke-NUS Medical School, Singapore
| | - Gillianne Geet Yi Lai
- Division of Medical Oncology, National Cancer Centre Singapore, Singapore
- Duke-NUS Medical School, Singapore
| | - Lionel Tim-Ee Cheng
- Duke-NUS Medical School, Singapore
- Department of Diagnostic Radiology, Singapore General Hospital, Singapore
| | - Fuh Yong Wong
- Division of Radiation Oncology, National Cancer Centre Singapore, Singapore
| | | | - Melvin Lee Kiang Chua
- Duke-NUS Medical School, Singapore
- Division of Radiation Oncology, National Cancer Centre Singapore, Singapore
- Data and Computational Science Core, National Cancer Centre Singapore, Singapore
| | - Daniel Shao Weng Tan
- Division of Medical Oncology, National Cancer Centre Singapore, Singapore
- Division of Clinical Trials and Epidemiological Sciences, National Cancer Centre Singapore, Singapore
| | - Choon Hua Thng
- Duke-NUS Medical School, Singapore
- Division of Oncologic Imaging, National Cancer Centre Singapore, Singapore
| | - Iain Bee Huat Tan
- Division of Medical Oncology, National Cancer Centre Singapore, Singapore
- Duke-NUS Medical School, Singapore
- Data and Computational Science Core, National Cancer Centre Singapore, Singapore
| | - Hwee Tou Ng
- Department of Computer Science, National University of Singapore, Singapore
| |
|
6
|
Elmarakeby HA, Trukhanov PS, Arroyo VM, Riaz IB, Schrag D, Van Allen EM, Kehl KL. Empirical evaluation of language modeling to ascertain cancer outcomes from clinical text reports. BMC Bioinformatics 2023; 24:328. [PMID: 37658330 PMCID: PMC10474750 DOI: 10.1186/s12859-023-05439-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Accepted: 08/07/2023] [Indexed: 09/03/2023] Open
Abstract
BACKGROUND Longitudinal data on key cancer outcomes for clinical research, such as response to treatment and disease progression, are not captured in standard cancer registry reporting. Manual extraction of such outcomes from unstructured electronic health records is a slow, resource-intensive process. Natural language processing (NLP) methods can accelerate outcome annotation, but they require substantial labeled data. Transfer learning based on language modeling, particularly using the Transformer architecture, has achieved improvements in NLP performance. However, there has been no systematic evaluation of NLP model training strategies on the extraction of cancer outcomes from unstructured text. RESULTS We evaluated the performance of nine NLP models at the two tasks of identifying cancer response and cancer progression within imaging reports at a single academic center among patients with non-small cell lung cancer. We trained the classification models under different conditions, including training sample size, classification architecture, and language model pre-training. The training involved a labeled dataset of 14,218 imaging reports for 1112 patients with lung cancer. A subset of models was based on a pre-trained language model, DFCI-ImagingBERT, created by further pre-training a BERT-based model using an unlabeled dataset of 662,579 reports from 27,483 patients with cancer from our center. A classifier based on our DFCI-ImagingBERT, trained on more than 200 patients, achieved the best results in most experiments; however, these results were marginally better than simpler "bag of words" or convolutional neural network models. CONCLUSION When developing AI models to extract outcomes from imaging reports for clinical cancer research, if computational resources are plentiful but labeled training data are limited, large language models can be used for zero- or few-shot learning to achieve reasonable performance. 
When computational resources are more limited but labeled training data are readily available, even simple machine learning architectures can achieve good performance for such tasks.
Affiliation(s)
- Haitham A Elmarakeby
- Dana-Farber Cancer Institute, Boston, MA, USA.
- Al-Azhar University, Cairo, Egypt.
- Harvard Medical School, Boston, MA, USA.
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA.
| | | | | | - Irbaz Bin Riaz
- Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Mayo Clinic, Rochester, MN, USA
| | - Deborah Schrag
- Memorial-Sloan Kettering Cancer Center, New York, NY, USA
| | - Eliezer M Van Allen
- Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Kenneth L Kehl
- Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| |
|
7
|
Solarte-Pabón O, Montenegro O, García-Barragán A, Torrente M, Provencio M, Menasalvas E, Robles V. Transformers for extracting breast cancer information from Spanish clinical narratives. Artif Intell Med 2023; 143:102625. [PMID: 37673566 DOI: 10.1016/j.artmed.2023.102625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 05/11/2023] [Accepted: 07/08/2023] [Indexed: 09/08/2023]
Abstract
The wide adoption of electronic health records (EHRs) offers immense potential as a source of support for clinical research. However, previous studies focused on extracting only a limited set of medical concepts to support information extraction in the cancer domain for the Spanish language. Building on the success of deep learning for processing natural language texts, this paper proposes a transformer-based approach to extract named entities from breast cancer clinical notes written in Spanish and compares several language models. To facilitate this approach, a schema for annotating clinical notes with breast cancer concepts is presented, and a corpus for breast cancer is developed. Results indicate that both BERT-based and RoBERTa-based language models demonstrate competitive performance in clinical Named Entity Recognition (NER). Specifically, BETO and multilingual BERT achieve F-scores of 93.71% and 94.63%, respectively. Additionally, RoBERTa Biomedical attains an F-score of 95.01%, while RoBERTa BNE achieves an F-score of 94.54%. The findings suggest that transformers can feasibly extract information in the clinical domain in the Spanish language, with the use of models trained on biomedical texts contributing to enhanced results. The proposed approach takes advantage of transfer learning techniques by fine-tuning language models to automatically represent text features and avoiding the time-consuming feature engineering process.
Affiliation(s)
- Oswaldo Solarte-Pabón
- Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Madrid, Spain; Escuela de Ingeniería de Sistemas, Universidad del Valle, Cali, Colombia.
| | - Orlando Montenegro
- Escuela de Ingeniería de Sistemas, Universidad del Valle, Cali, Colombia
| | | | - Maria Torrente
- Hospital Universitario Puerta de Hierro de Madrid, Madrid, Spain
| | | | - Ernestina Menasalvas
- Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Madrid, Spain
| | - Víctor Robles
- Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Madrid, Spain
| |
|
8
|
Moon I, LoPiccolo J, Baca SC, Sholl LM, Kehl KL, Hassett MJ, Liu D, Schrag D, Gusev A. Machine learning for genetics-based classification and treatment response prediction in cancer of unknown primary. Nat Med 2023; 29:2057-2067. [PMID: 37550415 DOI: 10.1038/s41591-023-02482-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 01/06/2023] [Accepted: 06/30/2023] [Indexed: 08/09/2023]
Abstract
Cancer of unknown primary (CUP) is a type of cancer that cannot be traced back to its primary site and accounts for 3-5% of all cancers. Established targeted therapies are lacking for CUP, leading to generally poor outcomes. We developed OncoNPC, a machine-learning classifier trained on targeted next-generation sequencing (NGS) data from 36,445 tumors across 22 cancer types from three institutions. Oncology NGS-based primary cancer-type classifier (OncoNPC) achieved a weighted F1 score of 0.942 for high confidence predictions ([Formula: see text]) on held-out tumor samples, which made up 65.2% of all the held-out samples. When applied to 971 CUP tumors collected at the Dana-Farber Cancer Institute, OncoNPC predicted primary cancer types with high confidence in 41.2% of the tumors. OncoNPC also identified CUP subgroups with significantly higher polygenic germline risk for the predicted cancer types and with significantly different survival outcomes. Notably, patients with CUP who received first palliative intent treatments concordant with their OncoNPC-predicted cancers had significantly better outcomes (hazard ratio (HR) = 0.348; 95% confidence interval (CI) = 0.210-0.570; P = [Formula: see text]). Furthermore, OncoNPC enabled a 2.2-fold increase in patients with CUP who could have received genomically guided therapies. OncoNPC thus provides evidence of distinct CUP subgroups and offers the potential for clinical decision support for managing patients with CUP.
Affiliation(s)
- Intae Moon
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Division of Population Sciences, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
| | - Jaclyn LoPiccolo
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Sylvan C Baca
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Lynette M Sholl
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Kenneth L Kehl
- Division of Population Sciences, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
| | - Michael J Hassett
- Division of Population Sciences, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
| | - David Liu
- Division of Population Sciences, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- The Broad Institute of MIT & Harvard, Cambridge, MA, USA
| | - Deborah Schrag
- Memorial Sloan Kettering Cancer Center, New York City, NY, USA
| | - Alexander Gusev
- Division of Population Sciences, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA.
- The Broad Institute of MIT & Harvard, Cambridge, MA, USA.
- Division of Genetics, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
| |
|
9
|
Mottin L, Goldman JP, Jäggli C, Achermann R, Gobeill J, Knafou J, Ehrsam J, Wicky A, Gérard CL, Schwenk T, Charrier M, Tsantoulis P, Lovis C, Leichtle A, Kiessling MK, Michielin O, Pradervand S, Foufi V, Ruch P. Multilingual RECIST classification of radiology reports using supervised learning. Front Digit Health 2023; 5:1195017. [PMID: 37388252 PMCID: PMC10303934 DOI: 10.3389/fdgth.2023.1195017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 06/05/2023] [Indexed: 07/01/2023] Open
Abstract
Objectives The objective of this study is the exploration of Artificial Intelligence and Natural Language Processing techniques to support the automatic assignment of the four Response Evaluation Criteria in Solid Tumors (RECIST) scales based on radiology reports. We also aim at evaluating how languages and institutional specificities of Swiss teaching hospitals are likely to affect the quality of the classification in French and German languages. Methods In our approach, 7 machine learning methods were evaluated to establish a strong baseline. Then, robust models were built, fine-tuned according to the language (French and German), and compared with the expert annotation. Results The best strategies yield average F1-scores of 90% and 86% respectively for the 2-class (Progressive/Non-progressive) and the 4-class (Progressive Disease, Stable Disease, Partial Response, Complete Response) RECIST classification tasks. Conclusions These results are competitive with the manual labeling as measured by the Matthews correlation coefficient and Cohen's Kappa (79% and 76%). On this basis, we confirm the capacity of specific models to generalize on new unseen data and we assess the impact of using Pre-trained Language Models (PLMs) on the accuracy of the classifiers.
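The abstract above compares model output with manual labeling using two agreement statistics. As a hedged illustration, the sketch below computes both for the binary (Progressive/Non-progressive) case from a 2x2 confusion matrix. The function names and the toy labels are assumptions for illustration; the study's own evaluation code and its 4-class computation are not described in the abstract.

```python
import math
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 = Progressive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def matthews_cc(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Matthews correlation coefficient for the binary case."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def cohen_kappa(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


# Toy labels (hypothetical): expert annotation vs. classifier output.
expert = [1, 1, 1, 1, 0, 0, 0, 0]
model = [1, 1, 1, 0, 0, 0, 0, 1]
mcc = matthews_cc(expert, model)
kappa = cohen_kappa(expert, model)
print(f"MCC = {mcc:.2f}, kappa = {kappa:.2f}")
```

Both statistics correct for chance agreement, which is why they are often reported alongside F1 when class frequencies are unbalanced.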
Affiliation(s)
- Luc Mottin
- HES-SO\HEG Genève, Information Sciences, Geneva, Switzerland
- SIB Text Mining Group, Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Jean-Philippe Goldman
- Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland
| | - Christoph Jäggli
- Inselspital – Bern University Hospital and University of Bern, Bern, Switzerland
| | - Rita Achermann
- Department of Radiology, Clinic of Radiology & Nuclear Medicine, University Hospital Basel, University of Basel, Basel, Switzerland
| | - Julien Gobeill
- HES-SO\HEG Genève, Information Sciences, Geneva, Switzerland
- SIB Text Mining Group, Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Julien Knafou
- HES-SO\HEG Genève, Information Sciences, Geneva, Switzerland
- SIB Text Mining Group, Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Julien Ehrsam
- Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
| | - Alexandre Wicky
- Precision Oncology Center, Oncology Department, Centre Hospitalier Universitaire Vaudois – CHUV, Lausanne, Switzerland
| | - Camille L. Gérard
- Precision Oncology Center, Oncology Department, Centre Hospitalier Universitaire Vaudois – CHUV, Lausanne, Switzerland
| | - Tanja Schwenk
- Department of Oncology, Kantonsspital Aarau, Aarau, Switzerland
| | - Mélinda Charrier
- Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland
| | - Petros Tsantoulis
- Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland
- Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
| | - Christian Lovis
- Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland
- Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
| | - Alexander Leichtle
- Inselspital – Bern University Hospital and University of Bern, Bern, Switzerland
| | - Michael K. Kiessling
- Department of Medical Oncology and Hematology, University Hospital Zurich, Zurich, Switzerland
| | - Olivier Michielin
- Precision Oncology Center, Oncology Department, Centre Hospitalier Universitaire Vaudois – CHUV, Lausanne, Switzerland
| | - Sylvain Pradervand
- Precision Oncology Center, Oncology Department, Centre Hospitalier Universitaire Vaudois – CHUV, Lausanne, Switzerland
| | - Vasiliki Foufi
- Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland
| | - Patrick Ruch
- HES-SO\HEG Genève, Information Sciences, Geneva, Switzerland
- SIB Text Mining Group, Swiss Institute of Bioinformatics, Geneva, Switzerland
|
10
|
Khan MS, Usman MS, Talha KM, Van Spall HGC, Greene SJ, Vaduganathan M, Khan SS, Mills NL, Ali ZA, Mentz RJ, Fonarow GC, Rao SV, Spertus JA, Roe MT, Anker SD, James SK, Butler J, McGuire DK. Leveraging electronic health records to streamline the conduct of cardiovascular clinical trials. Eur Heart J 2023; 44:1890-1909. [PMID: 37098746 DOI: 10.1093/eurheartj/ehad171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2022] [Revised: 02/05/2023] [Accepted: 03/07/2023] [Indexed: 04/27/2023] Open
Abstract
Conventional randomized controlled trials (RCTs) can be expensive, time intensive, and complex to conduct. Trial recruitment, participation, and data collection can burden participants and research personnel. In the past two decades, there have been rapid technological advances and an exponential growth in digitized healthcare data. Embedding RCTs, including cardiovascular outcome trials, into electronic health record systems or registries may streamline screening, consent, randomization, follow-up visits, and outcome adjudication. Moreover, wearable sensors (i.e. health and fitness trackers) provide an opportunity to collect data on cardiovascular health and risk factors in unprecedented detail and scale, while growing internet connectivity supports the collection of patient-reported outcomes. There is a pressing need to develop robust mechanisms that facilitate data capture from diverse databases and guidance to standardize data definitions. Importantly, the data collection infrastructure should be reusable to support multiple cardiovascular RCTs over time. Systems, processes, and policies will need to have sufficient flexibility to allow interoperability between different sources of data acquisition. Clinical research guidelines, ethics oversight, and regulatory requirements also need to evolve. This review highlights recent progress towards the use of routinely generated data to conduct RCTs and discusses potential solutions for ongoing barriers. There is a particular focus on methods to utilize routinely generated data for trials while complying with regional data protection laws. The discussion is supported with examples of cardiovascular outcome trials that have successfully leveraged the electronic health record, web-enabled devices or administrative databases to conduct randomized trials.
Affiliation(s)
- Muhammad Shahzeb Khan
- Division of Cardiology, Duke University School of Medicine, 2301 Erwin Rd., Durham, NC 27705, USA
| | - Muhammad Shariq Usman
- Department of Medicine, University of Mississippi Medical Center, 2500 N State St, Jackson, MS 39216, USA
| | - Khawaja M Talha
- Department of Medicine, University of Mississippi Medical Center, 2500 N State St, Jackson, MS 39216, USA
| | - Harriette G C Van Spall
- Department of Medicine, McMaster University, Hamilton, ON, Canada
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Population Health Research Institute, Hamilton, ON, Canada
| | - Stephen J Greene
- Division of Cardiology, Duke University School of Medicine, 2301 Erwin Rd., Durham, NC 27705, USA
- Duke Clinical Research Institute, Durham, NC, USA
| | - Muthiah Vaduganathan
- Cardiovascular Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Sadiya S Khan
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | - Nicholas L Mills
- BHF Centre for Cardiovascular Science, University of Edinburgh, Chancellors Building, Royal Infirmary of Edinburgh, Edinburgh, Scotland, UK
- Usher Institute, University of Edinburgh, Edinburgh, Scotland, UK
| | - Ziad A Ali
- DeMatteis Cardiovascular Institute, St Francis Hospital and Heart Center, Roslyn, NY, USA
| | - Robert J Mentz
- Division of Cardiology, Duke University School of Medicine, 2301 Erwin Rd., Durham, NC 27705, USA
- Duke Clinical Research Institute, Durham, NC, USA
| | - Gregg C Fonarow
- Division of Cardiology, University of California Los Angeles, Los Angeles, CA, USA
| | - Sunil V Rao
- Division of Cardiology, New York University Langone Health System, New York, NY, USA
| | - John A Spertus
- Department of Cardiology, Saint Luke's Mid America Heart Institute, Kansas City, MO, USA
- Kansas City's Healthcare Institute for Innovations in Quality, University of Missouri, Kansas, MO, USA
| | - Matthew T Roe
- Division of Cardiology, Duke University School of Medicine, 2301 Erwin Rd., Durham, NC 27705, USA
- Duke Clinical Research Institute, Durham, NC, USA
| | - Stefan D Anker
- Department of Cardiology (CVK), Berlin Institute of Health Center for Regenerative Therapies (BCRT), and German Centre for Cardiovascular Research (DZHK) Partner Site Berlin, Charité Universitätsmedizin, Berlin, Germany
| | - Stefan K James
- Department of Medical Sciences, Scientific Director UCR, Uppsala University, Uppsala, Uppland, Sweden
| | - Javed Butler
- Department of Medicine, University of Mississippi Medical Center, 2500 N State St, Jackson, MS 39216, USA
- Baylor Scott & White Research Institute, Dallas, TX, USA
| | - Darren K McGuire
- Division of Cardiology, Department of Internal Medicine, UT Southwestern Medical Center and Parkland Health and Hospital System, Dallas, TX, USA
|
11
|
Rios-Doria E, Momeni-Boroujeni A, Friedman CF, Selenica P, Zhou Q, Wu M, Marra A, Leitao MM, Iasonos A, Alektiar KM, Sonoda Y, Makker V, Jewell E, Liu Y, Chi D, Zamarin D, Abu-Rustum NR, Aghajanian C, Mueller JJ, Ellenson LH, Weigelt B. Integration of clinical sequencing and immunohistochemistry for the molecular classification of endometrial carcinoma. Gynecol Oncol 2023; 174:262-272. [PMID: 37245486 PMCID: PMC10402916 DOI: 10.1016/j.ygyno.2023.05.059] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 04/14/2023] [Accepted: 05/16/2023] [Indexed: 05/30/2023]
Abstract
PURPOSE Using next generation sequencing (NGS), The Cancer Genome Atlas (TCGA) found that endometrial carcinomas (ECs) fall under one of four molecular subtypes, and a POLE mutation status, mismatch repair (MMR) and p53 immunohistochemistry (IHC)-based surrogate has been developed. We sought to retrospectively classify and characterize a large series of unselected ECs that were prospectively subjected to clinical sequencing by utilizing clinical molecular and IHC data. EXPERIMENTAL DESIGN All patients with EC with clinical tumor-normal MSK-IMPACT NGS from 2014 to 2020 (n = 2115) were classified by integrating molecular data (i.e., POLE mutation, TP53 mutation, MSIsensor score) and MMR and p53 IHC results. Survival analysis was performed for primary EC patients with upfront surgery at our institution. RESULTS Utilizing our integrated approach, significantly more ECs were molecularly classified (1834/2115, 87%) as compared to the surrogate (1387/2115, 66%, p < 0.001), with an almost perfect agreement for classifiable cases (Kappa 0.962, 95% CI 0.949-0.975). Discrepancies were primarily due to TP53 mutations in p53-IHC-normal ECs. Of the 1834 ECs, most were of copy number (CN)-high molecular subtype (40%), followed by CN-low (32%), MSI-high (23%) and POLE (5%). Histologic and genomic variability was present amongst all molecular subtypes. Molecular classification was prognostic in early- and advanced-stage disease, including early-stage endometrioid EC. CONCLUSIONS The integration of clinical NGS and IHC data allows for an algorithmic approach to molecularly classifying newly diagnosed EC, while overcoming issues of IHC-based genetic alteration detection. Such an integrated approach will be important moving forward given the prognostic and potentially predictive information afforded by this classification.
Affiliation(s)
- Eric Rios-Doria
- Gynecology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Amir Momeni-Boroujeni
- Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Claire F Friedman
- Gynecologic Medical Oncology, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Department of Medicine, Weill Cornell Medical College, New York, NY, USA
| | - Pier Selenica
- Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Qin Zhou
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Michelle Wu
- Gynecology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Antonio Marra
- Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Mario M Leitao
- Gynecology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Department of Surgery, Weill Cornell Medical College, New York, NY, USA
| | - Alexia Iasonos
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Kaled M Alektiar
- Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Yukio Sonoda
- Gynecology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Department of Surgery, Weill Cornell Medical College, New York, NY, USA
| | - Vicky Makker
- Gynecologic Medical Oncology, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Department of Medicine, Weill Cornell Medical College, New York, NY, USA
| | - Elizabeth Jewell
- Gynecology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Department of Surgery, Weill Cornell Medical College, New York, NY, USA
| | - Ying Liu
- Gynecologic Medical Oncology, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Department of Medicine, Weill Cornell Medical College, New York, NY, USA
| | - Dennis Chi
- Gynecology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Department of Surgery, Weill Cornell Medical College, New York, NY, USA
| | - Dimitry Zamarin
- Gynecologic Medical Oncology, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Department of Medicine, Weill Cornell Medical College, New York, NY, USA
| | - Nadeem R Abu-Rustum
- Gynecology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Department of Surgery, Weill Cornell Medical College, New York, NY, USA
| | - Carol Aghajanian
- Gynecologic Medical Oncology, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Department of Medicine, Weill Cornell Medical College, New York, NY, USA
| | - Jennifer J Mueller
- Gynecology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Department of Surgery, Weill Cornell Medical College, New York, NY, USA
| | - Lora H Ellenson
- Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Britta Weigelt
- Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
|
12
|
Kroenke K, Lam V, Ruddy KJ, Pachman DR, Herrin J, Rahman PA, Griffin JM, Cheville AL. Prevalence, Severity, and Co-Occurrence of SPPADE Symptoms in 31,866 Patients With Cancer. J Pain Symptom Manage 2023; 65:367-377. [PMID: 36738867 PMCID: PMC10106386 DOI: 10.1016/j.jpainsymman.2023.01.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Revised: 01/21/2023] [Accepted: 01/24/2023] [Indexed: 02/05/2023]
Abstract
OBJECTIVES To examine the prevalence, severity, and co-occurrence of SPPADE symptoms as well as their association with cancer type and patient characteristics. BACKGROUND The SPPADE symptoms (sleep disturbance, pain, physical function impairment, anxiety, depression, and low energy/fatigue) are prevalent, co-occurring, and undertreated in oncology and other clinical populations. METHODS Baseline SPPADE symptom data were analyzed from the E2C2 study, a stepped wedge pragmatic, population-level, cluster randomized clinical trial designed to evaluate a guideline-informed symptom management model targeting the six SPPADE symptoms. Symptom prevalence and severity were measured with a 0-10 numeric rating scale (NRS) for each of the six symptoms. Prevalence of severe (NRS ≥ 7) and potentially clinically relevant (NRS ≥ 5) symptoms as well as co-occurrence of clinical symptoms were determined. Distribution-based methods were used to estimate the minimally important difference (MID). Associations of cancer type and patient characteristics with a SPPADE composite score were analyzed. RESULTS A total of 31,886 patients were assessed for SPPADE symptoms prior to, during, or soon after an outpatient medical oncology encounter. The proportion of patients with a potentially clinically relevant symptom ranged from 17.5% for depression to 33.4% for fatigue. Co-occurrence of symptoms was high, with the proportion of patients with three or more additional clinically relevant symptoms ranging from 45.2% for fatigue to 68.6% for depression. The summed SPPADE composite score demonstrated good internal reliability (Cronbach's alpha of 0.86), with preliminary MID estimates of 4.1-4.3. Symptom burden differed across several types of cancer but was generally similar across most sociodemographic characteristics. CONCLUSION The high prevalence and co-occurrence of SPPADE symptoms in patients with all types of cancer warrant clinical approaches that optimize detection and management.
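The internal-reliability statistic reported for the composite score, Cronbach's alpha, can be sketched as follows. The ratings are hypothetical 0-10 values for six items and do not come from the E2C2 data; the function name is illustrative.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one equal-length list of item scores per respondent."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed (composite) score.
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: four respondents, six items scored 0-10.
ratings = [
    [3, 4, 2, 1, 1, 5],
    [7, 8, 6, 5, 4, 8],
    [1, 0, 2, 0, 0, 3],
    [5, 5, 4, 3, 2, 6],
]
print(cronbach_alpha(ratings))
```

Alpha approaches 1 when items rise and fall together across respondents, which is the sense in which a summed score is called internally reliable.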
Affiliation(s)
- Kurt Kroenke
- Indiana University School of Medicine (K.K.), Indianapolis, Indiana, USA; Regenstrief Institute, Inc. (K.K.), Indianapolis, Indiana, USA.
| | - Veronica Lam
- Department of Physical Medicine and Rehabilitation (V.L., A.L.C.), Mayo Clinic, Rochester, Minnesota, USA
| | - Kathryn J Ruddy
- Division of Medical Oncology (K.J.R.), Mayo Clinic, Rochester, Minnesota, USA
| | - Deirdre R Pachman
- Division of Community Internal Medicine, Geriatrics, and Palliative Care (D.R.P.), Mayo Clinic, Rochester, Minnesota, USA
| | - Jeph Herrin
- Yale University School of Medicine (J.H.), New Haven, Connecticut, USA
| | - Parvez A Rahman
- Robert D. and Patricia E. Kern Center for the Science of Health Care Delivery (P.A.R., J.M.G., A.L.C.), Mayo Clinic, Rochester, Minnesota, USA
| | - Joan M Griffin
- Robert D. and Patricia E. Kern Center for the Science of Health Care Delivery (P.A.R., J.M.G., A.L.C.), Mayo Clinic, Rochester, Minnesota, USA; Division of Health Care Delivery Research (J.M.G.), Mayo Clinic, Rochester, Minnesota, USA
| | - Andrea L Cheville
- Department of Physical Medicine and Rehabilitation (V.L., A.L.C.), Mayo Clinic, Rochester, Minnesota, USA; Robert D. and Patricia E. Kern Center for the Science of Health Care Delivery (P.A.R., J.M.G., A.L.C.), Mayo Clinic, Rochester, Minnesota, USA
|
13
|
Sarmet M, Kabani A, Coelho L, Dos Reis SS, Zeredo JL, Mehta AK. The use of natural language processing in palliative care research: A scoping review. Palliat Med 2023; 37:275-290. [PMID: 36495082 DOI: 10.1177/02692163221141969] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 12/14/2022]
Abstract
BACKGROUND Natural language processing has been increasingly used in palliative care research over the last 5 years for its versatility and accuracy. AIM To evaluate and characterize natural language processing use in palliative care research, including the most commonly used natural language processing software and computational methods, data sources, trends in natural language processing use over time, and palliative care topics addressed. DESIGN A scoping review using the framework by Arksey and O'Malley and the updated recommendations proposed by Levac et al. was conducted. SOURCES PubMed, Web of Science, Embase, Scopus, and IEEE Xplore databases were searched for palliative care studies that utilized natural language processing tools. Data on study characteristics and natural language processing instruments used were collected and relevant palliative care topics were identified. RESULTS 197 relevant references were identified. Of these, 82 were included after full-text review. Studies were published in 48 different journals from 2007 to 2022. The average sample size was 21,541 (median 435). Thirty-two different natural language processing software tools and 33 machine-learning methods were identified. Nine main sources for data processing and 15 main palliative care topics across the included studies were identified. The most frequent topic was mortality and prognosis prediction. We also identified a trend where natural language processing was frequently used in analyzing clinical serious illness conversations extracted from audio recordings. CONCLUSIONS We found 82 papers on palliative care using natural language processing methods for a wide range of topics and sources of data that could expand the use of this methodology. We encourage researchers to consider incorporating this cutting-edge research methodology in future studies to improve published palliative care data.
Affiliation(s)
- Max Sarmet
- Tertiary Referral Center of Neuromuscular Diseases, Hospital de Apoio de Brasília, Brazil.,Graduate Department of Health Science and Technology, University of Brasília, Brazil
| | - Aamna Kabani
- Johns Hopkins University, School of Medicine, USA
| | - Luis Coelho
- Center of Innovation in Engineering and Industrial Technology, Polytechnic of Porto - School of Engineering (ISEP), Portugal
| | - Sara Seabra Dos Reis
- Center of Innovation in Engineering and Industrial Technology, Polytechnic of Porto - School of Engineering (ISEP), Portugal
| | - Jorge L Zeredo
- Graduate Department of Health Science and Technology, University of Brasília, Brazil
| | - Ambereen K Mehta
- Palliative Care Program, Division of General Internal Medicine, Johns Hopkins Bayview Medical Center, Johns Hopkins University, School of Medicine, USA
|
14
|
Kotevski DP, Smee RI, Vajdic CM, Field M. Empirical comparison of routinely collected electronic health record data for head and neck cancer-specific survival in machine-learnt prognostic models. Head Neck 2023; 45:365-379. [PMID: 36369773 PMCID: PMC10100433 DOI: 10.1002/hed.27241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Revised: 09/21/2022] [Accepted: 11/02/2022] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Knowledge of the prognostic factors and performance of machine learning predictive models for 2-year cancer-specific survival (CSS) is limited in the head and neck cancer (HNC) population. METHODS Data from our facilities' oncology information system (OIS) collected for routine practice (OIS dataset, n = 430 patients) and research purposes (research dataset, n = 529 patients) were extracted on adults diagnosed between 2000 and 2017 with squamous cell carcinoma of the head and neck. RESULTS Machine learning demonstrated excellent performance (area under the curve, AUC) in the whole cohort (AUC = 0.97, research dataset), larynx cohort (AUC = 0.98, both datasets), and oropharynx cohort (AUC = 0.99, both datasets). Tumor site and T classification were identified as predictors of 2-year CSS in both datasets. Hypothyroidism and fitness for operation were further identified in the research dataset. CONCLUSIONS Datasets extracted from an OIS for routine clinical practice and research purposes demonstrated high utility for informing 2-year head and neck CSS.
Affiliation(s)
- Damian P Kotevski
- Department of Radiation Oncology, Prince of Wales Hospital and Community Health Services, Sydney, New South Wales, Australia.,Prince of Wales Clinical School, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
| | - Robert I Smee
- Department of Radiation Oncology, Prince of Wales Hospital and Community Health Services, Sydney, New South Wales, Australia.,Prince of Wales Clinical School, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia.,Department of Radiation Oncology, Tamworth Base Hospital, Tamworth, New South Wales, Australia
| | - Claire M Vajdic
- Centre for Big Data Research in Health, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia.,Kirby Institute, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
| | - Matthew Field
- South Western Sydney Clinical School, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia.,Ingham Institute for Applied Medical Research, Sydney, New South Wales, Australia
|
15
|
Nunez JJ, Leung B, Ho C, Bates AT, Ng RT. Predicting the Survival of Patients With Cancer From Their Initial Oncology Consultation Document Using Natural Language Processing. JAMA Netw Open 2023; 6:e230813. [PMID: 36848085 PMCID: PMC9972192 DOI: 10.1001/jamanetworkopen.2023.0813] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/01/2023] Open
Abstract
IMPORTANCE Predicting short- and long-term survival of patients with cancer may improve their care. Prior predictive models either use data with limited availability or predict the outcome of only 1 type of cancer. OBJECTIVE To investigate whether natural language processing can predict survival of patients with general cancer from a patient's initial oncologist consultation document. DESIGN, SETTING, AND PARTICIPANTS This retrospective prognostic study used data from 47 625 of 59 800 patients who started cancer care at any of the 6 BC Cancer sites located in the province of British Columbia between April 1, 2011, and December 31, 2016. Mortality data were updated until April 6, 2022, and data were analyzed from update until September 30, 2022. All patients with a medical or radiation oncologist consultation document generated within 180 days of diagnosis were included; patients seen for multiple cancers were excluded. EXPOSURES Initial oncologist consultation documents were analyzed using traditional and neural language models. MAIN OUTCOMES AND MEASURES The primary outcome was the performance of the predictive models, including balanced accuracy and receiver operating characteristics area under the curve (AUC). The secondary outcome was investigating what words the models used. RESULTS Of the 47 625 patients in the sample, 25 428 (53.4%) were female and 22 197 (46.6%) were male, with a mean (SD) age of 64.9 (13.7) years. A total of 41 447 patients (87.0%) survived 6 months, 31 143 (65.4%) survived 36 months, and 27 880 (58.5%) survived 60 months, calculated from their initial oncologist consultation. The best models achieved a balanced accuracy of 0.856 (AUC, 0.928) for predicting 6-month survival, 0.842 (AUC, 0.918) for 36-month survival, and 0.837 (AUC, 0.918) for 60-month survival, on a holdout test set. Differences in what words were important for predicting 6- vs 60-month survival were found. CONCLUSIONS AND RELEVANCE These findings suggest that models performed comparably with or better than previous models predicting cancer survival and that they may be able to predict survival using readily available data without focusing on 1 cancer type.
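Balanced accuracy, the primary outcome measure above, matters when classes are uneven (for example, 87.0% of patients surviving 6 months). A minimal sketch of the metric, using hypothetical labels rather than study data and an illustrative function name:

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; for two classes, (sensitivity + specificity) / 2."""
    recalls = []
    for c in sorted(set(y_true)):
        # Predictions made for the cases whose true class is c.
        predictions_for_c = [p for t, p in zip(y_true, y_pred) if t == c]
        recalls.append(sum(p == c for p in predictions_for_c) / len(predictions_for_c))
    return sum(recalls) / len(recalls)


# Hypothetical labels: 1 = survived the horizon, 0 = did not.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(balanced_accuracy(y_true, y_pred))
```

Unlike plain accuracy, this score does not reward a model for always predicting the majority class: such a model would score 0.5 on a two-class task regardless of the class balance.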
Affiliation(s)
- John-Jose Nunez
- BC Cancer, Vancouver, British Columbia, Canada
- Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
| | | | - Cheryl Ho
- BC Cancer, Vancouver, British Columbia, Canada
| | - Alan T. Bates
- BC Cancer, Vancouver, British Columbia, Canada
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
| | - Raymond T. Ng
- Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada
|
16
|
Kotevski DP, Smee RI, Field M, Broadley K, Vajdic CM. The Utility of Oncology Information Systems for Prognostic Modelling in Head and Neck Cancer. J Med Syst 2023; 47:9. [PMID: 36640212 PMCID: PMC9840592 DOI: 10.1007/s10916-023-01907-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Accepted: 01/03/2023] [Indexed: 01/15/2023]
Abstract
Cancer centres rely on electronic information in oncology information systems (OIS) to guide patient care. We investigated the completeness and accuracy of routinely collected head and neck cancer (HNC) data sourced from an OIS for suitability in prognostic modelling and other research. Three hundred and fifty-three adults diagnosed from 2000 to 2017 with head and neck squamous cell carcinoma, treated with radiotherapy, were eligible. Thirteen clinically relevant variables in HNC prognosis were extracted from a single-centre OIS and compared with those compiled separately in a research dataset. These two datasets were compared for agreement using Cohen's kappa coefficient for categorical variables, and intraclass correlation coefficients for continuous variables. Research data were 96% complete compared to 84% for OIS data. Agreement was perfect for gender (κ = 1.000), high for age (κ = 0.993), site (κ = 0.992), T (κ = 0.851) and N (κ = 0.812) stage, radiotherapy dose (κ = 0.889), fractions (κ = 0.856), and duration (κ = 0.818), and chemotherapy treatment (κ = 0.871), substantial for overall stage (κ = 0.791) and vital status (κ = 0.689), moderate for grade (κ = 0.547), and poor for performance status (κ = 0.110). Thirty-one other variables were poorly captured and could not be statistically compared. Documentation of clinical information within the OIS for HNC patients is routine practice; however, OIS data were less accurate and complete than data collected for research purposes. Substandard collection of routine data may hinder advancements in patient care. Improved data entry, integration with clinical activities and workflows, system usability, data dictionaries, and training are necessary for OIS data to generate robust research. Data mining from clinical documents may supplement structured data collection.
Affiliation(s)
- Damian P Kotevski
- Department of Radiation Oncology, Prince of Wales Hospital, Level 1, Bright Building, Barker St, Randwick, NSW, 2031, Australia.
- Prince of Wales Clinical School, Faculty of Medicine, University of New South Wales, Kensington, NSW, Australia.
| | - Robert I Smee
- Department of Radiation Oncology, Prince of Wales Hospital, Level 1, Bright Building, Barker St, Randwick, NSW, 2031, Australia
- Prince of Wales Clinical School, Faculty of Medicine, University of New South Wales, Kensington, NSW, Australia
- Department of Radiation Oncology, Tamworth Base Hospital, Tamworth, NSW, Australia
| | - Matthew Field
- South Western Sydney Clinical School, Faculty of Medicine, University of New South Wales, Kensington, NSW, Australia
- Ingham Institute for Applied Medical Research, Liverpool, NSW, Australia
| | - Kathryn Broadley
- Cancer and Haematology Services, Prince of Wales Hospital, Randwick, NSW, Australia
| | - Claire M Vajdic
- Centre for Big Data Research in Health, Faculty of Medicine, University of New South Wales, Kensington, NSW, Australia
- Kirby Institute, Faculty of Medicine, University of New South Wales, Kensington, NSW, Australia
|
17
|
Kotevski DP, Smee RI, Vajdic CM, Field M. Machine Learning and Nomogram Prognostic Modeling for 2-Year Head and Neck Cancer-Specific Survival Using Electronic Health Record Data: A Multisite Study. JCO Clin Cancer Inform 2023; 7:e2200128. [PMID: 36596211 DOI: 10.1200/cci.22.00128] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/04/2023] Open
Abstract
PURPOSE There is limited knowledge of the prediction of 2-year cancer-specific survival (CSS) in the head and neck cancer (HNC) population. The aim of this study is to develop and validate machine learning models and a nomogram for the prediction of 2-year CSS in patients with HNC using real-world data collected by major teaching and tertiary referral hospitals in New South Wales (NSW), Australia. MATERIALS AND METHODS Data collected in oncology information systems at multiple NSW Cancer Centres were extracted for 2,953 eligible adults diagnosed between 2000 and 2017 with squamous cell carcinoma of the head and neck. Death data were sourced from the National Death Index using record linkage. Machine learning and Cox regression/nomogram models were developed and internally validated in Python and R, respectively. RESULTS Machine learning models demonstrated highest performance (C-index) in the larynx and nasopharynx cohorts (0.82), followed by the oropharynx (0.79) and the hypopharynx and oral cavity cohorts (0.73). In the whole HNC population, C-indexes of 0.79 and 0.70 and Brier scores of 0.10 and 0.27 were reported for the machine learning and nomogram model, respectively. Cox regression analysis identified age, T and N classification, and time-corrected biologic equivalent dose in two gray fractions as independent prognostic factors for 2-year CSS. N classification was the most important feature used for prediction in the machine learning model followed by age. CONCLUSION Machine learning and nomogram analysis predicted 2-year CSS with high performance using routinely collected and complete clinical information extracted from oncology information systems. These models function as visual decision-making tools to guide radiotherapy treatment decisions and provide insight into the prediction of survival outcomes in patients with HNC.
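The two performance measures reported above, the C-index and the Brier score, can be illustrated with a simplified sketch. It assumes Harrell's formulation of the C-index and a Brier score computed on fully observed binary outcomes (no censoring adjustment); the abstract does not state which variants the authors used, and all values below are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Share of comparable pairs in which the earlier event has the higher risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied risk scores count as half
    return concordant / comparable


def brier_score(predicted_probs, outcomes):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(predicted_probs, outcomes)) / len(outcomes)


# Hypothetical follow-up in months, event flags (1 = cancer death), and model risk scores.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.6, 0.4, 0.7]
print(harrell_c_index(times, events, risks))
print(brier_score([0.9, 0.2], [1, 0]))
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, while a lower Brier score indicates better-calibrated probabilities, so the two numbers answer different questions about the same model.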
Affiliation(s)
- Damian P Kotevski
- Department of Radiation Oncology, Prince of Wales Hospital and Community Health Services, Sydney, New South Wales, Australia; Prince of Wales Clinical School, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
- Robert I Smee
- Department of Radiation Oncology, Prince of Wales Hospital and Community Health Services, Sydney, New South Wales, Australia; Prince of Wales Clinical School, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia; Department of Radiation Oncology, Tamworth Base Hospital, Tamworth, New South Wales, Australia
- Claire M Vajdic
- Kirby Institute, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
- Matthew Field
- South Western Sydney Clinical Campus, School of Clinical Medicine, University of New South Wales, Sydney, New South Wales, Australia; South Western Sydney Cancer Services, NSW Health, Sydney, Sydney, New South Wales, Australia; Ingham Institute for Applied Medical Research, Sydney, New South Wales, Australia
|
18
|
Fang C, Markuzon N, Patel N, Rueda JD. Natural Language Processing for Automated Classification of Qualitative Data From Interviews of Patients With Cancer. Value Health 2022; 25:1995-2002. [PMID: 35840523 DOI: 10.1016/j.jval.2022.06.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2021] [Revised: 05/19/2022] [Accepted: 06/12/2022] [Indexed: 06/15/2023]
Abstract
OBJECTIVES This study sought to explore the use of novel natural language processing (NLP) methods for classifying unstructured, qualitative textual data from interviews of patients with cancer to identify patient-reported symptoms and impacts on quality of life. METHODS We tested the ability of 4 NLP models to accurately classify text from interview transcripts as "symptom," "quality of life impact," and "other." Interview data sets from patients with hepatocellular carcinoma (HCC) (n = 25), biliary tract cancer (BTC) (n = 23), and gastric cancer (n = 24) were used. Models were cross-validated with transcript subsets designated for training, validation, and testing. Multiclass classification performance of the 4 models was evaluated at paragraph and sentence level using the HCC testing data set and analyzed by the one-versus-rest technique quantified by the receiver operating characteristic area under the curve (ROC AUC) score. RESULTS NLP models accurately classified multiclass text from patient interviews. The Bidirectional Encoder Representations from Transformers model generally outperformed all other models at paragraph and sentence level. The highest predictive performance of the Bidirectional Encoder Representations from Transformers model was observed using the HCC data set to train and BTC data set to test (mean ROC AUC, 0.940 [SD 0.028]), with similarly high predictive performance using balanced and imbalanced training data sets from BTC and gastric cancer populations. CONCLUSIONS NLP models were accurate in predicting multiclass classification of text from interviews of patients with cancer, with most surpassing 0.9 ROC AUC at paragraph level. NLP may be a useful tool for scaling up processing of patient interviews in clinical studies and, thus, could serve to facilitate patient input into drug development and improving patient care.
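The one-versus-rest ROC AUC evaluation mentioned in this abstract can be sketched as follows. This is an illustrative sketch only, not the study's pipeline: the helper names and the toy scores are hypothetical, and the AUC is computed with the pairwise ranking definition (the probability that a positive example outscores a negative one) rather than with any particular library.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(true_classes, class_scores, classes):
    """Treat each class in turn as 'positive' and all others as 'negative'."""
    aucs = {}
    for k, cls in enumerate(classes):
        binary_labels = [1 if t == cls else 0 for t in true_classes]
        scores_for_cls = [row[k] for row in class_scores]
        aucs[cls] = binary_auc(binary_labels, scores_for_cls)
    return aucs


# The three labels named in the abstract; the scores below are invented.
classes = ["symptom", "quality of life impact", "other"]
true_classes = ["symptom", "symptom", "quality of life impact",
                "other", "other", "quality of life impact"]
class_scores = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
    [0.6, 0.1, 0.3],
    [0.3, 0.4, 0.3],
]
per_class_auc = one_vs_rest_auc(true_classes, class_scores, classes)
mean_auc = sum(per_class_auc.values()) / len(per_class_auc)
```

Averaging the per-class values gives a single summary figure comparable in kind to the mean ROC AUC the abstract reports.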
Affiliation(s)
- Chao Fang
- Oncology Biometrics ML/AI, AstraZeneca, Waltham, MA, USA
- Nikunj Patel
- US Medical Affairs, AstraZeneca, Gaithersburg, MD, USA
- Juan-David Rueda
- Oncology Market Access and Pricing, AstraZeneca, Gaithersburg, MD, USA
|
19
|
Wang L, Fu S, Wen A, Ruan X, He H, Liu S, Moon S, Mai M, Riaz IB, Wang N, Yang P, Xu H, Warner JL, Liu H. Assessment of Electronic Health Record for Cancer Research and Patient Care Through a Scoping Review of Cancer Natural Language Processing. JCO Clin Cancer Inform 2022; 6:e2200006. [PMID: 35917480 PMCID: PMC9470142 DOI: 10.1200/cci.22.00006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2022] [Revised: 03/18/2022] [Accepted: 06/15/2022] [Indexed: 11/20/2022]
Abstract
PURPOSE The advancement of natural language processing (NLP) has promoted the use of detailed textual data in electronic health records (EHRs) to support cancer research and to facilitate patient care. In this review, we aim to assess EHR for cancer research and patient care by using the Minimal Common Oncology Data Elements (mCODE), which is a community-driven effort to define a minimal set of data elements for cancer research and practice. Specifically, we aim to assess the alignment of NLP-extracted data elements with mCODE and review existing NLP methodologies for extracting said data elements. METHODS Published literature studies were searched to retrieve cancer-related NLP articles that were written in English and published between January 2010 and September 2020 from main literature databases. After the retrieval, articles with EHRs as the data source were manually identified. A charting form was developed for relevant study analysis and used to categorize data including four main topics: metadata, EHR data and targeted cancer types, NLP methodology, and oncology data elements and standards. RESULTS A total of 123 publications were ultimately selected and included in our analysis. We found that, as expected, cancer research and patient care require some data elements beyond mCODE. Transparency and reproducibility of NLP methods are insufficient, and NLP evaluation is inconsistent. CONCLUSION We conducted a comprehensive review of cancer NLP for research and patient care using EHR data. Issues and barriers for wide adoption of cancer NLP were identified and discussed.
Affiliation(s)
- Liwei Wang
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Sunyang Fu
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Andrew Wen
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Xiaoyang Ruan
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Huan He
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Sijia Liu
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Sungrim Moon
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Michelle Mai
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
- Irbaz B. Riaz
- Department of Hematology/Oncology, Mayo Clinic, Scottsdale, AZ
- Nan Wang
- Department of Computer Science and Engineering, College of Science and Engineering, University of Minnesota, Minneapolis, MN
- Ping Yang
- Department of Quantitative Health Sciences, Mayo Clinic, Scottsdale, AZ
- Hua Xu
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX
- Jeremy L. Warner
- Departments of Medicine (Hematology/Oncology), Vanderbilt University, Nashville, TN
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN
- Hongfang Liu
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN
|
20
|
Lindvall C, Deng CY, Agaronnik ND, Kwok A, Samineni S, Umeton R, Mackie-Jenkins W, Kehl KL, Tulsky JA, Enzinger AC. Deep Learning for Cancer Symptoms Monitoring on the Basis of Electronic Health Record Unstructured Clinical Notes. JCO Clin Cancer Inform 2022; 6:e2100136. [PMID: 35714301 PMCID: PMC9232368 DOI: 10.1200/cci.21.00136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
Abstract
PURPOSE Symptoms are vital outcomes for cancer clinical trials, observational research, and population-level surveillance. Patient-reported outcomes (PROs) are valuable for monitoring symptoms, yet there are many challenges to collecting PROs at scale. We sought to develop, test, and externally validate a deep learning model to extract symptoms from unstructured clinical notes in the electronic health record. METHODS We randomly selected 1,225 outpatient progress notes from among patients treated at the Dana-Farber Cancer Institute between January 2016 and December 2019 and used 1,125 notes as our training/validation data set and 100 notes as our test data set. We evaluated the performance of 10 deep learning models for detecting 80 symptoms included in the National Cancer Institute's Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE) framework. Model performance as compared with manual chart abstraction was assessed using standard metrics, and the highest performer was externally validated on a sample of 100 physician notes from a different clinical context. RESULTS In our training and test data sets, 75 of the 80 candidate symptoms were identified. The ELECTRA-small model had the highest performance for symptom identification at the token level (ie, at the individual symptom level), with an F1 of 0.87 and a processing time of 3.95 seconds per note. For the 10 most common symptoms in the test data set, the F1 score ranged from 0.98 for anxious to 0.86 for fatigue. For external validation of the same symptoms, the note-level performance ranged from F1 = 0.97 for diarrhea and dizziness to F1 = 0.73 for swelling. CONCLUSION Training a deep learning model to identify a wide range of electronic health record-documented symptoms relevant to cancer care is feasible. This approach could be used at the health system scale to complement electronic PROs.
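The F1 scores quoted in this abstract combine precision and recall against manual chart abstraction. A minimal sketch of that calculation is given below, assuming note-level (note, symptom) pairs as the unit of comparison. The function name and the example annotations are hypothetical and are not taken from the study.

```python
def precision_recall_f1(gold, predicted):
    """Compare predicted items with gold-standard items and return (P, R, F1)."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical annotations: (note id, symptom) pairs.
gold = [("note1", "fatigue"), ("note1", "nausea"),
        ("note2", "anxious"), ("note2", "pain")]
predicted = [("note1", "fatigue"), ("note2", "anxious"),
             ("note2", "pain"), ("note2", "swelling")]
precision, recall, f1 = precision_recall_f1(gold, predicted)
```

Here one symptom is missed (nausea) and one is spurious (swelling), so precision, recall, and F1 all equal 0.75.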
Affiliation(s)
- Charlotta Lindvall
- Dana-Farber Cancer Institute, Boston, MA; Harvard Medical School, Boston, MA; Brigham and Women's Hospital, Boston, MA
- Nicole D Agaronnik
- Dana-Farber Cancer Institute, Boston, MA; Harvard Medical School, Boston, MA
- Anne Kwok
- Dana-Farber Cancer Institute, Boston, MA
- Kenneth L Kehl
- Dana-Farber Cancer Institute, Boston, MA; Harvard Medical School, Boston, MA; Brigham and Women's Hospital, Boston, MA
- James A Tulsky
- Dana-Farber Cancer Institute, Boston, MA; Harvard Medical School, Boston, MA; Brigham and Women's Hospital, Boston, MA
- Andrea C Enzinger
- Dana-Farber Cancer Institute, Boston, MA; Harvard Medical School, Boston, MA; Brigham and Women's Hospital, Boston, MA
|
21
|
Shreve JT, Khanani SA, Haddad TC. Artificial Intelligence in Oncology: Current Capabilities, Future Opportunities, and Ethical Considerations. Am Soc Clin Oncol Educ Book 2022; 42:1-10. [PMID: 35687826 DOI: 10.1200/edbk_350652] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 01/04/2023]
Abstract
The promise of highly personalized oncology care using artificial intelligence (AI) technologies has been forecasted since the emergence of the field. Cumulative advances across the science are bringing this promise to realization, including refinement of machine learning- and deep learning algorithms; expansion in the depth and variety of databases, including multiomics; and the decreased cost of massively parallelized computational power. Examples of successful clinical applications of AI can be found throughout the cancer continuum and in multidisciplinary practice, with computer vision-assisted image analysis in particular having several U.S. Food and Drug Administration-approved uses. Techniques with emerging clinical utility include whole blood multicancer detection from deep sequencing, virtual biopsies, natural language processing to infer health trajectories from medical notes, and advanced clinical decision support systems that combine genomics and clinomics. Substantial issues have delayed broad adoption, with data transparency and interpretability suffering from AI's "black box" mechanism, and intrinsic bias against underrepresented persons limiting the reproducibility of AI models and perpetuating health care disparities. Midfuture projections of AI maturation involve increasing a model's complexity by using multimodal data elements to better approximate an organic system. Far-future positing includes living databases that accumulate all aspects of a person's health into discrete data elements; this will fuel highly convoluted modeling that can tailor treatment selection, dose determination, surveillance modality and schedule, and more. The field of AI has had a historical dichotomy between its proponents and detractors. The successful development of recent applications, and continued investment in prospective validation that defines their impact on multilevel outcomes, has established a momentum of accelerated progress.
Affiliation(s)
- Tufia C Haddad
- Department of Oncology, Mayo Clinic, Rochester, MN; Center for Digital Health, Mayo Clinic, Rochester, MN
|
22
|
Kehl KL, Xu W, Gusev A, Bakouny Z, Choueiri TK, Riaz IB, Elmarakeby H, Van Allen EM, Schrag D. Artificial intelligence-aided clinical annotation of a large multi-cancer genomic dataset. Nat Commun 2021; 12:7304. [PMID: 34911934 PMCID: PMC8674229 DOI: 10.1038/s41467-021-27358-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/14/2021] [Accepted: 11/16/2021] [Indexed: 02/08/2023]
Abstract
To accelerate cancer research that correlates biomarkers with clinical endpoints, methods are needed to ascertain outcomes from electronic health records at scale. Here, we train deep natural language processing (NLP) models to extract outcomes for participants with any of 7 solid tumors in a precision oncology study. Outcomes are extracted from 305,151 imaging reports for 13,130 patients and 233,517 oncologist notes for 13,511 patients, including patients with 6 additional cancer types. NLP models recapitulate outcome annotation from these documents, including the presence of cancer, progression/worsening, response/improvement, and metastases, with excellent discrimination (AUROC > 0.90). Models generalize to cancers excluded from training and yield outcomes correlated with survival. Among patients receiving checkpoint inhibitors, we confirm that high tumor mutation burden is associated with superior progression-free survival ascertained using NLP. Here, we show that deep NLP can accelerate annotation of molecular cancer datasets with clinically meaningful endpoints to facilitate discovery.
Affiliation(s)
- Kenneth L Kehl
- Dana-Farber Cancer Institute, Boston, MA, USA; Brigham and Women's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA
- Wenxin Xu
- Dana-Farber Cancer Institute, Boston, MA, USA; Brigham and Women's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA
- Alexander Gusev
- Dana-Farber Cancer Institute, Boston, MA, USA; Brigham and Women's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA
- Ziad Bakouny
- Dana-Farber Cancer Institute, Boston, MA, USA; Brigham and Women's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA
- Toni K Choueiri
- Dana-Farber Cancer Institute, Boston, MA, USA; Brigham and Women's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA
- Haitham Elmarakeby
- Dana-Farber Cancer Institute, Boston, MA, USA; Brigham and Women's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; The Broad Institute, Rochester, USA
- Eliezer M Van Allen
- Dana-Farber Cancer Institute, Boston, MA, USA; Brigham and Women's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; The Broad Institute, Rochester, USA
|
23
|
Ma C, Sridharan M, Al-Sayegh H, Li A, Guo D, Auclair M, Kuragayala V, Bandaru C, Milne D, Cruse H, Beaudoin R, Orechia J, Bickel J, London WB. Building a Harmonized Datamart by Integrating Cross-Institutional Systems of Clinical, Outcome, and Genomic Data: The Pediatric Patient Informatics Platform (PPIP). JCO Clin Cancer Inform 2021; 5:202-215. [PMID: 33591797 DOI: 10.1200/cci.20.00083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
Abstract
PURPOSE Siloed electronic medical data limits utility and accessibility. At the Dana-Farber/Boston Children's Cancer and Blood Disorders Center, cross-institutional data were inconsistent and difficult to access. To unify data for clinical operations, administration, and research, we developed the Pediatric Patient Informatics Platform (PPIP), an integrated datamart harmonizing multiple source systems across two institutions into a common technology. PATIENTS AND METHODS Starting in 2009, user requirements were gathered and data sources were prioritized. Project teams, including biostatisticians, database developers, and an external contractor, were formed. Read-access to source systems was established. The 3-layer PPIP architecture was developed: STAGING, a near-exact copy of source data; INTEGRATION, where data were reorganized into domains; and, CONSUMPTION, where data were optimized for rapid retrieval. The diverse systems were integrated into a common IBM Netezza technology. Data filters were defined to accurately capture the Center's patients, and derived data items were created for harmonization across sources. An interactive online query tool, PPIP360, was developed using Microstrategy Analytics. RESULTS Driven by scientific objectives, the PPIP datamart was created, including 33,674 patients, 2,983 protocols, and 3.6 million patient visits from 14 source databases, 164 source tables, and 2,622 source data items. The PPIP360 has 605 data items and 33 metrics across 11 reports and dashboards. Dana-Farber and Boston Children's established a legal data-sharing agreement. The PPIP has supported hundreds of faculty, staff, and projects, including planning clinical trials and informing strategic planning. 
CONCLUSION The PPIP has successfully harmonized and integrated diagnostic, demographic, laboratory, treatment, clinical outcome, pathology, transplant, meta-protocol, and -omics data, for efficient, daily operational and research activities at Dana-Farber/Boston Children's Cancer and Blood Disorders Center, and future external sharing.
Affiliation(s)
- Clement Ma
- Dana-Farber/Boston Children's Cancer and Blood Disorders Center, Boston, MA; Harvard Medical School, Boston, MA
- Hasan Al-Sayegh
- Dana-Farber/Boston Children's Cancer and Blood Disorders Center, Boston, MA
- Anran Li
- Dana-Farber/Boston Children's Cancer and Blood Disorders Center, Boston, MA
- Dongjing Guo
- Dana-Farber/Boston Children's Cancer and Blood Disorders Center, Boston, MA
- Dana Milne
- Dana-Farber Cancer Institute, Boston, MA
- Jonathan Bickel
- Harvard Medical School, Boston, MA; Boston Children's Hospital, Boston, MA
- Wendy B London
- Dana-Farber/Boston Children's Cancer and Blood Disorders Center, Boston, MA; Harvard Medical School, Boston, MA
|
24
|
Alkaitis MS, Agrawal MN, Riely GJ, Razavi P, Sontag D. Automated NLP Extraction of Clinical Rationale for Treatment Discontinuation in Breast Cancer. JCO Clin Cancer Inform 2021; 5:550-560. [PMID: 33989016 DOI: 10.1200/cci.20.00139] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022]
Abstract
PURPOSE Key oncology end points are not routinely encoded into electronic medical records (EMRs). We assessed whether natural language processing (NLP) can abstract treatment discontinuation rationale from unstructured EMR notes to estimate toxicity incidence and progression-free survival (PFS). METHODS We constructed a retrospective cohort of 6,115 patients with early-stage and 701 patients with metastatic breast cancer initiating care at Memorial Sloan Kettering Cancer Center from 2008 to 2019. Each cohort was divided into training (70%), validation (15%), and test (15%) subsets. Human abstractors identified the clinical rationale associated with treatment discontinuation events. Concatenated EMR notes were used to train high-dimensional logistic regression and convolutional neural network models. Kaplan-Meier analyses were used to compare toxicity incidence and PFS estimated by our NLP models to estimates generated by manual labeling and time-to-treatment discontinuation (TTD). RESULTS Our best high-dimensional logistic regression models identified toxicity events in early-stage patients with an area under the curve of the receiver-operator characteristic of 0.857 ± 0.014 (standard deviation) and progression events in metastatic patients with an area under the curve of 0.752 ± 0.027 (standard deviation). NLP-extracted toxicity incidence and PFS curves were not significantly different from manually extracted curves (P = .95 and P = .67, respectively). By contrast, TTD overestimated toxicity in early-stage patients (P < .001) and underestimated PFS in metastatic patients (P < .001). Additionally, we tested an extrapolation approach in which 20% of the metastatic cohort were labeled manually, and NLP algorithms were used to abstract the remaining 80%. 
This extrapolated outcomes approach resolved PFS differences between receptor subtypes (P < .001 for hormone receptor+/human epidermal growth factor receptor 2- v human epidermal growth factor receptor 2+ v triple-negative) that could not be resolved with TTD. CONCLUSION NLP models are capable of abstracting treatment discontinuation rationale with minimal manual labeling.
Affiliation(s)
- Matthew S Alkaitis
- CSAIL & IMES, Massachusetts Institute of Technology, Cambridge, MA; Harvard Medical School, Boston, MA
- Monica N Agrawal
- CSAIL & IMES, Massachusetts Institute of Technology, Cambridge, MA
- Gregory J Riely
- Memorial Sloan Kettering Cancer Center, New York, NY; Weill-Cornell Medical College, New York, NY
- Pedram Razavi
- Memorial Sloan Kettering Cancer Center, New York, NY; Weill-Cornell Medical College, New York, NY
- David Sontag
- CSAIL & IMES, Massachusetts Institute of Technology, Cambridge, MA
|
25
|
Ronquillo JG, Lester WT. Practical Aspects of Implementing and Applying Health Care Cloud Computing Services and Informatics to Cancer Clinical Trial Data. JCO Clin Cancer Inform 2021; 5:826-832. [PMID: 34383582 DOI: 10.1200/cci.21.00018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
Abstract
PURPOSE Cloud computing has led to dramatic growth in the volume, variety, and velocity of cancer data. However, cloud platforms and services present new challenges for cancer research, particularly in understanding the practical tradeoffs between cloud performance, cost, and complexity. The goal of this study was to describe the practical challenges when using a cloud-based service to improve the cancer clinical trial matching process. METHODS We collected information for all interventional cancer clinical trials from ClinicalTrials.gov and used the Google Cloud Healthcare Natural Language Application Programming Interface (API) to analyze clinical trial Title and Eligibility Criteria text. An informatics pipeline leveraging interoperability standards summarized the distribution of cancer clinical trials, genes, laboratory tests, and medications extracted from cloud-based entity analysis. RESULTS There were a total of 38,851 cancer-related clinical trials found in this study, with the distribution of cancer categories extracted from Title text significantly different than in ClinicalTrials.gov (P < .001). Cloud-based entity analysis of clinical trial criteria identified a total of 949 genes, 1,782 laboratory tests, 2,086 medications, and 4,902 National Cancer Institute Thesaurus terms, with estimated detection accuracies ranging from 12.8% to 89.9%. A total of 77,702 API calls processed an estimated 167,179 text records, which took a total of 1,979 processing-minutes (33.0 processing-hours), or approximately 1.5 seconds per API call. CONCLUSION Current general-purpose cloud health care tools, like the Google service in this study, should not be used for automated clinical trial matching unless they can perform effective extraction and classification of the clinical, genetic, and medication concepts central to precision oncology research.
A strong understanding of the practical aspects of cloud computing will help researchers effectively navigate the vast data ecosystems in cancer research.
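The throughput figures in this abstract can be checked with simple arithmetic. The sketch below uses only the numbers stated in the abstract; the variable names are illustrative.

```python
# Figures as reported in the abstract.
processing_minutes = 1979
api_calls = 77702
text_records = 167179

seconds_per_call = processing_minutes * 60 / api_calls
processing_hours = processing_minutes / 60
records_per_call = text_records / api_calls

print(f"{seconds_per_call:.2f} s per call, {processing_hours:.1f} h total")
```

The result agrees with the reported values of approximately 1.5 seconds per API call and 33.0 processing-hours, and it implies that each call handled a little over two text records on average.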
Affiliation(s)
- Jay G Ronquillo
- Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD; Office of Data Science Strategy, National Institutes of Health, Bethesda, MD
- William T Lester
- Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA; Harvard Medical School, Boston, MA
|
26
|
Kehl KL, Riely GJ, Lepisto EM, Lavery JA, Warner JL, LeNoue-Newton ML, Sweeney SM, Rudolph JE, Brown S, Yu C, Bedard PL, Schrag D, Panageas KS. Correlation Between Surrogate End Points and Overall Survival in a Multi-institutional Clinicogenomic Cohort of Patients With Non-Small Cell Lung or Colorectal Cancer. JAMA Netw Open 2021; 4:e2117547. [PMID: 34309669 PMCID: PMC8314138 DOI: 10.1001/jamanetworkopen.2021.17547] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 02/01/2023]
Abstract
IMPORTANCE Contemporary observational cancer research requires associating genomic biomarkers with reproducible end points; overall survival (OS) is a key end point, but interpretation can be challenging when multiple lines of therapy and prolonged survival are common. Progression-free survival (PFS), time to treatment discontinuation (TTD), and time to next treatment (TTNT) are alternative end points, but their utility as surrogates for OS in real-world clinicogenomic data sets has not been well characterized. OBJECTIVE To measure correlations between candidate surrogate end points and OS in a multi-institutional clinicogenomic data set. DESIGN, SETTING, AND PARTICIPANTS A retrospective cohort study was conducted of patients with non-small cell lung cancer (NSCLC) or colorectal cancer (CRC) whose tumors were genotyped at 4 academic centers from January 1, 2014, to December 31, 2017, and who initiated systemic therapy for advanced disease. Patients were followed up through August 31, 2020 (NSCLC), and October 31, 2020 (CRC). Statistical analyses were conducted on January 5, 2021. EXPOSURES Candidate surrogate end points included TTD; TTNT; PFS based on imaging reports only; PFS based on medical oncologist ascertainment only; PFS based on either imaging or medical oncologist ascertainment, whichever came first; and PFS defined by a requirement that both imaging and medical oncologist ascertainment have indicated progression. MAIN OUTCOMES AND MEASURES The primary outcome was the correlation between candidate surrogate end points and OS. RESULTS There were 1161 patients with NSCLC (648 women [55.8%]; mean [SD] age, 63 [11] years) and 1150 with CRC (647 men [56.3%]; mean [SD] age, 54 [12] years) identified for analysis. Progression-free survival based on both imaging and medical oncologist documentation was most correlated with OS (NSCLC: ρ = 0.76; 95% CI, 0.73-0.79; CRC: ρ = 0.73; 95% CI, 0.69-0.75). 
Time to treatment discontinuation was least associated with OS (NSCLC: ρ = 0.45; 95% CI, 0.40-0.50; CRC: ρ = 0.13; 95% CI, 0.06-0.19). Time to next treatment was modestly associated with OS (NSCLC: ρ = 0.60; 95% CI, 0.55-0.64; CRC: ρ = 0.39; 95% CI, 0.32-0.46). CONCLUSIONS AND RELEVANCE This cohort study suggests that PFS based on both a radiologist and a treating oncologist determining that a progression event has occurred was the surrogate end point most highly correlated with OS for analysis of observational clinicogenomic data.
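The ρ values in this abstract measure how closely each surrogate end point tracks OS. A minimal sketch of a rank correlation is shown below, on the assumption that ρ is a Spearman-type coefficient, which the abstract does not state. The sketch also ignores censoring, which a real survival analysis would need to handle, and the function names and data are hypothetical.

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical months of PFS and OS for five patients (no censoring).
pfs_months = [3, 6, 9, 12, 15]
os_months = [8, 10, 20, 18, 30]
rho = spearman_rho(pfs_months, os_months)
```

In this toy example only one adjacent pair of patients is ordered differently by PFS and OS, so the rank correlation is high (0.9).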
Affiliation(s)
- Kenneth L. Kehl
- Department of Medical Oncology, Division of Population Sciences, Dana-Farber Cancer Institute, Harvard Medical School, Boston, Massachusetts
- Gregory J. Riely
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, New York
- Eva M. Lepisto
- Department of Medical Oncology, Division of Population Sciences, Dana-Farber Cancer Institute, Harvard Medical School, Boston, Massachusetts
- Jessica A. Lavery
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, New York
- Jeremy L. Warner
- Department of Medicine, Division of Hematology/Oncology, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, Tennessee
- Department of Biomedical Informatics, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, Tennessee
- Shawn M. Sweeney
- American Association for Cancer Research, Philadelphia, Pennsylvania
- Julia E. Rudolph
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, New York
- Samantha Brown
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, New York
- Celeste Yu
- Division of Medical Oncology & Hematology, Princess Margaret Cancer Centre/University Health Network, Toronto, Ontario, Canada
- Philippe L. Bedard
- Division of Medical Oncology & Hematology, Princess Margaret Cancer Centre/University Health Network, Toronto, Ontario, Canada
- Department of Medicine, University of Toronto, Toronto, Ontario, Canada
- Deborah Schrag
- Department of Medical Oncology, Division of Population Sciences, Dana-Farber Cancer Institute, Harvard Medical School, Boston, Massachusetts
- Associate Editor, JAMA
- Katherine S. Panageas
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, New York
|
27
|
Saini KS, Twelves C. Determining lines of therapy in patients with solid cancers: a proposed new systematic and comprehensive framework. Br J Cancer 2021; 125:155-163. [PMID: 33850304 PMCID: PMC8292475 DOI: 10.1038/s41416-021-01319-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 06/18/2020] [Revised: 01/25/2021] [Accepted: 02/10/2021] [Indexed: 12/18/2022]
Abstract
The complexity of neoplasia and its treatment are a challenge to the formulation of general criteria that are applicable across solid cancers. Determining the number of prior lines of therapy (LoT) is critically important for optimising future treatment, conducting medication audits, and assessing eligibility for clinical trial enrolment. Currently, however, no accepted set of criteria or definitions exists to enumerate LoT. In this article, we seek to open a dialogue to address this challenge by proposing a systematic and comprehensive framework to determine LoT uniformly across solid malignancies. First, key terms, including LoT and 'clinical progression of disease' are defined. Next, we clarify which therapies should be assigned a LoT, and why. Finally, we propose reporting LoT in a novel and standardised format as LoT N (CLoT + PLoT), where CLoT is the number of systemic anti-cancer therapies (SACT) administered with curative intent and/or in the early setting, PLoT is the number of SACT given with palliative intent and/or in the advanced setting, and N is the sum of CLoT and PLoT. As a next step, the cancer research community should develop and adopt standardised guidelines for enumerating LoT in a uniform manner.
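The proposed reporting format, LoT N (CLoT + PLoT), can be illustrated with a short sketch. The function name and the exact string layout are assumptions made for illustration; the abstract defines only the relationship N = CLoT + PLoT and the meaning of each term.

```python
def format_lot(clot: int, plot: int) -> str:
    """Render lines of therapy as 'LoT N (CLoT + PLoT)' with N = CLoT + PLoT.

    clot: number of systemic anti-cancer therapies given with curative intent
          and/or in the early setting.
    plot: number given with palliative intent and/or in the advanced setting.
    """
    if clot < 0 or plot < 0:
        raise ValueError("line counts must be non-negative")
    return f"LoT {clot + plot} ({clot} + {plot})"


# Hypothetical patient: one curative-intent line, then two palliative lines.
summary = format_lot(clot=1, plot=2)
print(summary)  # prints "LoT 3 (1 + 2)"
```

Reporting the two components alongside the total keeps the distinction between early-setting and advanced-setting treatment visible, which a single count would hide.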
Affiliation(s)
- Kamal S Saini
- Covance Inc., Princeton, NJ, USA.
- East Suffolk and North Essex NHS Foundation Trust, Ipswich, UK.
| | - Chris Twelves
- University of Leeds and Leeds Teaching Hospitals Trust, Leeds, UK.
| |
|
28
|
Momeni-Boroujeni A, Dahoud W, Vanderbilt CM, Chiang S, Murali R, Rios-Doria EV, Alektiar KM, Aghajanian C, Abu-Rustum NR, Ladanyi M, Ellenson LH, Weigelt B, Soslow RA. Clinicopathologic and Genomic Analysis of TP53-Mutated Endometrial Carcinomas. Clin Cancer Res 2021; 27:2613-2623. [PMID: 33602681 PMCID: PMC8530276 DOI: 10.1158/1078-0432.ccr-20-4436] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 11/13/2020] [Revised: 01/16/2021] [Accepted: 02/11/2021] [Indexed: 11/16/2022]
Abstract
PURPOSE Copy number-high endometrial carcinomas were described by The Cancer Genome Atlas as high-grade endometrioid and serous cancers showing frequent copy-number alterations (CNA), low mutational burden (i.e., non-hypermutant), near-universal TP53 mutation, and unfavorable clinical outcomes. We sought to investigate and compare the clinicopathologic and molecular characteristics of non-hypermutant TP53-altered endometrial carcinomas of four histologic types. EXPERIMENTAL DESIGN TP53-mutated endometrial carcinomas, defined as TP53-mutant tumors lacking microsatellite instability or pathogenic POLE mutations, were identified (n = 238) in a cohort of 1,239 endometrial carcinomas subjected to clinical massively parallel sequencing of 410-468 cancer-related genes. Somatic mutations and CNAs (n = 238), and clinicopathologic features were determined (n = 185, initial treatment planning at our institution). RESULTS TP53-mutated endometrial carcinomas encompassed uterine serous (n = 102, 55.1%), high-grade endometrial carcinoma with ambiguous features/not otherwise specified (EC-NOS; n = 44, 23.8%), endometrioid carcinomas of all tumor grades (n = 28, 15.1%), and clear cell carcinomas (n = 11, 5.9%). PTEN mutations were significantly more frequent in endometrioid carcinomas, SPOP mutations in clear cell carcinomas, and CCNE1 amplification in serous carcinomas/EC-NOS; however, none of these genomic alterations were exclusive to any given histologic type. ERBB2 amplification was present at similar frequencies across TP53-mutated histologic types (7.7%-18.6%). Although overall survival was similar across histologic types, serous carcinomas presented more frequently at stage IV, had more persistent and/or recurrent disease, and reduced disease-free survival. CONCLUSIONS TP53-mutated endometrial carcinomas display clinical and molecular similarities across histologic subtypes. Our data provide evidence to suggest performance of ERBB2 assessment in all TP53-mutated endometrial carcinomas. Given the distinct clinical features of serous carcinomas, histologic classification continues to be relevant.
Affiliation(s)
| | - Wissam Dahoud
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Chad M Vanderbilt
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Sarah Chiang
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Rajmohan Murali
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Eric V Rios-Doria
- Gynecology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Kaled M Alektiar
- Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Carol Aghajanian
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Nadeem R Abu-Rustum
- Gynecology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Marc Ladanyi
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Lora H Ellenson
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Britta Weigelt
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York.
| | - Robert A Soslow
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York.
| |
|
29
|
Evolution of Hematology Clinical Trial Adverse Event Reporting to Improve Care Delivery. Curr Hematol Malig Rep 2021; 16:126-131. [PMID: 33786724 DOI: 10.1007/s11899-021-00627-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Accepted: 03/21/2021] [Indexed: 10/21/2022]
Abstract
PURPOSE OF REVIEW Reporting of adverse events on hematology clinical trials is crucial to understanding the safety of standard treatments and novel agents. However, despite the importance of understanding toxicities, challenges in capturing and reporting accurate adverse event data exist. RECENT FINDINGS Currently, adverse events are reported manually on most hematology clinical trials. Especially on phase III trials, the highest grade of each adverse event during a reporting period is typically reported. Despite the effort committed to AE reporting, studies have identified underreporting of adverse events on hematologic malignancy clinical trials, which raises concern about the true understanding of safety of treatment that clinicians have in order to guide patients about what to expect during therapy. In order to address these concerns, recent studies have piloted alternative methods for identification of adverse events. These methods include automated extraction of adverse event data from the electronic health record, implementation of trigger or alert tools into the medical record, and analytic tools to evaluate duration of adverse events rather than only the highest adverse event grade. Adverse event reporting is a crucial component of clinical trials. Novel tools for identifying and reporting adverse events provide opportunities for honing and refining methods of toxicity capture and improving understanding of toxicities patients experience while enrolled on clinical trials.
|
30
|
Sorin V, Barash Y, Konen E, Klang E. Deep-learning natural language processing for oncological applications. Lancet Oncol 2020; 21:1553-1556. [DOI: 10.1016/s1470-2045(20)30615-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/30/2020] [Accepted: 10/05/2020] [Indexed: 10/22/2022]
|