Reference Citation Analysis

For: Rios A, Kavuluru R. Neural transfer learning for assigning diagnosis codes to EMRs. Artif Intell Med 2019;96:116-122. [PMID: 31164204 DOI: 10.1016/j.artmed.2019.04.002] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 07/19/2018] [Revised: 12/20/2018] [Accepted: 04/10/2019] [Indexed: 11/25/2022]

Cited by Other Article(s)

Li Y, Yang Q, Wang FL, Lee LK, Qu Y, Hao T. Asymmetric cross-modal attention network with multimodal augmented mixup for medical visual question answering. Artif Intell Med 2023;144:102667. [PMID: 37783542 DOI: 10.1016/j.artmed.2023.102667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 07/24/2023] [Accepted: 09/14/2023] [Indexed: 10/04/2023]

Kufel J, Bargieł-Łączek K, Kocot S, Koźlik M, Bartnikowska W, Janik M, Czogalik Ł, Dudek P, Magiera M, Lis A, Paszkiewicz I, Nawrat Z, Cebula M, Gruszczyńska K. What Is Machine Learning, Artificial Neural Networks and Deep Learning?-Examples of Practical Applications in Medicine. Diagnostics (Basel) 2023;13:2582. [PMID: 37568945 PMCID: PMC10417718 DOI: 10.3390/diagnostics13152582] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/24/2023] [Revised: 07/19/2023] [Accepted: 08/01/2023] [Indexed: 08/13/2023] Open

Affiliation(s)

Jakub Kufel: Department of Biophysics, Faculty of Medical Sciences in Zabrze, Medical University of Silesia, 41-808 Zabrze, Poland
Katarzyna Bargieł-Łączek: Paediatric Radiology Students’ Scientific Association at the Division of Diagnostic Imaging, Department of Radiology and Nuclear Medicine, Faculty of Medical Science in Katowice, Medical University of Silesia, 40-752 Katowice, Poland
Szymon Kocot: Bright Coders’ Factory, Technologiczna 2, 45-839 Opole, Poland
Maciej Koźlik: Division of Cardiology and Structural Heart Disease, Medical University of Silesia, 40-635 Katowice, Poland
Wiktoria Bartnikowska: Paediatric Radiology Students’ Scientific Association at the Division of Diagnostic Imaging, Department of Radiology and Nuclear Medicine, Faculty of Medical Science in Katowice, Medical University of Silesia, 40-752 Katowice, Poland
Michał Janik: Student Scientific Association Named after Professor Zbigniew Religa at the Department of Biophysics, Faculty of Medical Sciences in Zabrze, Medical University of Silesia, Jordana 19, 41-808 Zabrze, Poland
Łukasz Czogalik: Student Scientific Association Named after Professor Zbigniew Religa at the Department of Biophysics, Faculty of Medical Sciences in Zabrze, Medical University of Silesia, Jordana 19, 41-808 Zabrze, Poland
Piotr Dudek: Student Scientific Association Named after Professor Zbigniew Religa at the Department of Biophysics, Faculty of Medical Sciences in Zabrze, Medical University of Silesia, Jordana 19, 41-808 Zabrze, Poland
Mikołaj Magiera: Student Scientific Association Named after Professor Zbigniew Religa at the Department of Biophysics, Faculty of Medical Sciences in Zabrze, Medical University of Silesia, Jordana 19, 41-808 Zabrze, Poland
Anna Lis: Cardiology Students’ Scientific Association at the III Department of Cardiology, Faculty of Medical Sciences in Katowice, Medical University of Silesia, 40-635 Katowice, Poland
Iga Paszkiewicz: Student Scientific Association Named after Professor Zbigniew Religa at the Department of Biophysics, Faculty of Medical Sciences in Zabrze, Medical University of Silesia, Jordana 19, 41-808 Zabrze, Poland
Zbigniew Nawrat: Department of Biophysics, Faculty of Medical Sciences in Zabrze, Medical University of Silesia, 41-808 Zabrze, Poland
Maciej Cebula: Individual Specialist Medical Practice Maciej Cebula, 40-754 Katowice, Poland
Katarzyna Gruszczyńska: Department of Radiodiagnostics, Invasive Radiology and Nuclear Medicine, Department of Radiology and Nuclear Medicine, School of Medicine in Katowice, Medical University of Silesia, Medyków 14, 40-752 Katowice, Poland

Cai L, Li J, Lv H, Liu W, Niu H, Wang Z. Integrating domain knowledge for biomedical text analysis into deep learning: A survey. J Biomed Inform 2023;143:104418. [PMID: 37290540 DOI: 10.1016/j.jbi.2023.104418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Revised: 04/24/2023] [Accepted: 05/31/2023] [Indexed: 06/10/2023]

Pierre K, Haneberg AG, Kwak S, Peters KR, Hochhegger B, Sananmuang T, Tunlayadechanont P, Tighe PJ, Mancuso A, Forghani R. Applications of Artificial Intelligence in the Radiology Roundtrip: Process Streamlining, Workflow Optimization, and Beyond. Semin Roentgenol 2023;58:158-169. [PMID: 37087136 DOI: 10.1053/j.ro.2023.02.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2023] [Accepted: 02/14/2023] [Indexed: 04/24/2023]

Affiliation(s)

Kevin Pierre: Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL; Department of Radiology, University of Florida College of Medicine, Gainesville, FL
Adam G Haneberg: Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL; Division of Medical Physics, Department of Radiology, University of Florida College of Medicine, Gainesville, FL
Sean Kwak: Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL
Keith R Peters: Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL; Department of Radiology, University of Florida College of Medicine, Gainesville, FL
Bruno Hochhegger: Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL; Department of Radiology, University of Florida College of Medicine, Gainesville, FL
Thiparom Sananmuang: Department of Diagnostic and Therapeutic Radiology and Research, Faculty of Medicine Ramathibodi Hospital, Ratchathewi, Bangkok, Thailand
Padcha Tunlayadechanont: Department of Diagnostic and Therapeutic Radiology and Research, Faculty of Medicine Ramathibodi Hospital, Ratchathewi, Bangkok, Thailand
Patrick J Tighe: Departments of Anesthesiology & Orthopaedic Surgery, University of Florida College of Medicine, Gainesville, FL
Anthony Mancuso: Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL; Department of Radiology, University of Florida College of Medicine, Gainesville, FL
Reza Forghani: Radiomics and Augmented Intelligence Laboratory (RAIL), Department of Radiology and the Norman Fixel Institute for Neurological Diseases, University of Florida College of Medicine, Gainesville, FL; Department of Radiology, University of Florida College of Medicine, Gainesville, FL; Division of Medical Physics, Department of Radiology, University of Florida College of Medicine, Gainesville, FL

Amini S, Hao B, Zhang L, Song M, Gupta A, Karjadi C, Kolachalama VB, Au R, Paschalidis IC. Automated detection of mild cognitive impairment and dementia from voice recordings: A natural language processing approach. Alzheimers Dement 2023;19:946-955. [PMID: 35796399 PMCID: PMC10148688 DOI: 10.1002/alz.12721] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 09/13/2021] [Revised: 03/20/2022] [Accepted: 05/18/2022] [Indexed: 11/06/2022]

Ponthongmak W, Thammasudjarit R, McKay GJ, Attia J, Theera-Ampornpunt N, Thakkinstian A. Development and external validation of automated ICD-10 coding from discharge summaries using deep learning approaches. Informatics in Medicine Unlocked 2023. [DOI: 10.1016/j.imu.2023.101227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023] Open

Ding J, Li B, Xu C, Qiao Y, Zhang L. Diagnosing crop diseases based on domain-adaptive pre-training BERT of electronic medical records. Appl Intell 2022. [DOI: 10.1007/s10489-022-04346-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2022]

Gutton J, Lin F, Billuart O, Lajonchère JP, Crubilié C, Sauvage C, Buronfosse A. [Artificial intelligence for medical information departments: construction and evaluation of a decision-making tool to identify and prioritize stays of which the PMSI coding could be optimized, and to ensure the revenues generated by activity-based pricing]. Rev Epidemiol Sante Publique 2022;70:1-8. [PMID: 35027236 DOI: 10.1016/j.respe.2021.11.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2020] [Revised: 03/11/2021] [Accepted: 11/22/2021] [Indexed: 11/18/2022] Open

Guttha N, Miao Z, Shamsuddin R. Towards the Development of a Substance Abuse Index (SEI) through Informatics. Healthcare (Basel) 2021;9:1596. [PMID: 34828641 PMCID: PMC8620603 DOI: 10.3390/healthcare9111596] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/13/2021] [Revised: 11/15/2021] [Accepted: 11/16/2021] [Indexed: 11/16/2022] Open

Explainable ICD multi-label classification of EHRs in Spanish with convolutional attention. Int J Med Inform 2021;157:104615. [PMID: 34741890 DOI: 10.1016/j.ijmedinf.2021.104615] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/27/2021] [Revised: 09/23/2021] [Accepted: 10/08/2021] [Indexed: 01/16/2023]

Laparra E, Mascio A, Velupillai S, Miller T. A Review of Recent Work in Transfer Learning and Domain Adaptation for Natural Language Processing of Electronic Health Records. Yearb Med Inform 2021;30:239-244. [PMID: 34479396 PMCID: PMC8416218 DOI: 10.1055/s-0041-1726522] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/13/2022] Open

Abstract

Objectives: We survey recent work in biomedical NLP on building more adaptable or generalizable models, with a focus on work dealing with electronic health record (EHR) texts, to better understand recent trends in this area and identify opportunities for future research.

Methods: We searched PubMed, the Institute of Electrical and Electronics Engineers (IEEE), the Association for Computational Linguistics (ACL) anthology, the Association for the Advancement of Artificial Intelligence (AAAI) proceedings, and Google Scholar for the years 2018-2020. We reviewed abstracts to identify the most relevant and impactful work, and manually extracted data points from each of these papers to characterize the types of methods and tasks that were studied, in which clinical domains, and current state-of-the-art results.

Results: The ubiquity of pre-trained transformers in clinical NLP research has contributed to an increase in domain adaptation and generalization-focused work that uses these models as the key component. Most recently, work has started to train biomedical transformers and to extend the fine-tuning process with additional domain adaptation techniques. We also highlight recent research in cross-lingual adaptation, as a special case of adaptation.

Conclusions: While pre-trained transformer models have led to some large performance improvements, general domain pre-training does not always transfer adequately to the clinical domain due to its highly specialized language. There is also much work to be done in showing that the gains obtained by pre-trained transformers are beneficial in real world use cases. The amount of work in domain adaptation and transfer learning is limited by dataset availability and creating datasets for new domains is challenging. The growing body of research in languages other than English is encouraging, and more collaboration between researchers across the language divide would likely accelerate progress in non-English clinical NLP.

Joo H, Burns M, Kalidaikurichi Lakshmanan SS, Hu Y, Vydiswaran VGV. Neural Machine Translation-Based Automated Current Procedural Terminology Classification System Using Procedure Text: Development and Validation Study. JMIR Form Res 2021;5:e22461. [PMID: 34037526 PMCID: PMC8190648 DOI: 10.2196/22461] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/31/2020] [Revised: 03/02/2021] [Accepted: 04/19/2021] [Indexed: 12/19/2022] Open

Abstract

BACKGROUND

Administrative costs for billing and insurance-related activities in the United States are substantial. One critical cause of the high overhead of administrative costs is medical billing errors. With advanced deep learning techniques, developing advanced models to predict hospital and professional billing codes has become feasible. These models can be used for administrative cost reduction and billing process improvements.

OBJECTIVE

In this study, we aim to develop an automated anesthesiology current procedural terminology (CPT) prediction system that translates manually entered surgical procedure text into standard forms using neural machine translation (NMT) techniques. The standard forms are calculated using similarity scores to predict the most appropriate CPT codes. Although this system aims to enhance medical billing coding accuracy to reduce administrative costs, we compare its performance with that of previously developed machine learning algorithms.
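The similarity-scoring step described above can be illustrated with a minimal sketch. The abstract does not say which similarity measure was used, so token-level Jaccard similarity is an assumption here, and the catalog entries (`CODE_A`, `CODE_B`, `CODE_C` and their descriptions) are hypothetical placeholders, not real CPT codes. The input string stands in for the output of the translation step, which is not reproduced.

```python
from typing import Dict, List, Tuple


def token_jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase token sets of two strings."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def rank_codes(standard_form: str, catalog: Dict[str, str], top_k: int = 3) -> List[Tuple[str, float]]:
    """Score every catalog description against the standardized text and
    return the top_k (code, score) pairs, best first."""
    scored = [(code, token_jaccard(standard_form, desc)) for code, desc in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical catalog: placeholder codes and descriptions for illustration only.
CATALOG = {
    "CODE_A": "anesthesia for hernia repair lower abdomen",
    "CODE_B": "anesthesia for knee arthroscopy",
    "CODE_C": "anesthesia for cataract surgery",
}

# Stand-in for the standardized text a translation model might produce.
ranked = rank_codes("anesthesia for inguinal hernia repair", CATALOG)
```

Returning a ranked list rather than a single code makes it straightforward to report both top-1 and top-3 accuracy from the same predictions.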

METHODS

We collected and analyzed all operative procedures performed at Michigan Medicine between January 2017 and June 2019 (2.5 years). The first 2 years of data were used to train and validate the existing models and compare the results from the NMT-based model. Data from 2019 (6-month follow-up period) were then used to measure the accuracy of the CPT code prediction. Three experimental settings were designed with different data types to evaluate the models. Experiment 1 used the surgical procedure text entered manually in the electronic health record. Experiment 2 used preprocessing of the procedure text. Experiment 3 used preprocessing of the combined procedure text and preoperative diagnoses. The NMT-based model was compared with the support vector machine (SVM) and long short-term memory (LSTM) models.

RESULTS

The NMT model yielded the highest top-1 accuracy in experiments 1 and 2 at 81.64% and 81.71% compared with the SVM model (81.19% and 81.27%, respectively) and the LSTM model (80.96% and 81.07%, respectively). The SVM model yielded the highest top-1 accuracy of 84.30% in experiment 3, followed by the LSTM model (83.70%) and the NMT model (82.80%). In experiment 3, the addition of preoperative diagnoses produced relative increases in top-1 accuracy of 3.7%, 3.2%, and 1.3% for the SVM, LSTM, and NMT models, respectively, over experiment 2. For top-3 accuracy, the SVM, LSTM, and NMT models achieved 95.64%, 95.72%, and 95.60% for experiment 1, 95.75%, 95.67%, and 95.69% for experiment 2, and 95.88%, 95.93%, and 95.06% for experiment 3, respectively.
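As a small worked example of the metrics reported above, the sketch below computes top-k accuracy from ranked predictions and recomputes the experiment 2 to experiment 3 gains from the stated top-1 accuracies. The function names are illustrative, not taken from the study. The recomputation suggests that the 3.7%, 3.2%, and 1.3% figures are relative increases rather than percentage-point differences.

```python
from typing import Dict, List, Sequence, Tuple


def top_k_accuracy(ranked_predictions: Sequence[Sequence[str]], true_codes: Sequence[str], k: int) -> float:
    """Fraction of cases whose true code appears among the first k ranked predictions."""
    hits = sum(1 for preds, truth in zip(ranked_predictions, true_codes) if truth in preds[:k])
    return hits / len(true_codes)


def relative_increase(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Top-1 accuracies (experiment 2, experiment 3) as reported in the abstract.
REPORTED: Dict[str, Tuple[float, float]] = {
    "SVM": (81.27, 84.30),
    "LSTM": (81.07, 83.70),
    "NMT": (81.71, 82.80),
}

gains = {model: round(relative_increase(e2, e3), 1) for model, (e2, e3) in REPORTED.items()}

# Toy ranked predictions to exercise top_k_accuracy (labels are arbitrary).
toy_ranked: List[List[str]] = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
toy_truth = ["a", "a", "a"]
```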

CONCLUSIONS

This study demonstrates the feasibility of creating an automated anesthesiology CPT classification system based on NMT techniques using surgical procedure text and preoperative diagnosis. Our results show that the performance of the NMT-based CPT prediction system is equivalent to that of the SVM and LSTM prediction models. Importantly, we found that including preoperative diagnoses improved accuracy over using the procedure text alone.

Amini S, Zhang L, Hao B, Gupta A, Song M, Karjadi C, Lin H, Kolachalama VB, Au R, Paschalidis IC. An Artificial Intelligence-Assisted Method for Dementia Detection Using Images from the Clock Drawing Test. J Alzheimers Dis 2021;83:581-589. [PMID: 34334396 PMCID: PMC9049046 DOI: 10.3233/jad-210299] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/11/2022]

An ensemble unsupervised spiking neural network for objective recognition. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.07.109] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/23/2022]

Noh J, Kavuluru R. Literature Retrieval for Precision Medicine with Neural Matching and Faceted Summarization. Proceedings of the Conference on Empirical Methods in Natural Language Processing 2020;2020:3389-3399. [PMID: 34541588 PMCID: PMC8444997 DOI: 10.18653/v1/2020.findings-emnlp.304] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/26/2022]

Letourneau-Guillon L, Camirand D, Guilbert F, Forghani R. Artificial Intelligence Applications for Workflow, Process Optimization and Predictive Analytics. Neuroimaging Clin N Am 2020;30:e1-e15. [PMID: 33039002 DOI: 10.1016/j.nic.2020.08.008] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 12/22/2022]

Automated ICD-10 code assignment of nonstandard diagnoses via a two-stage framework. Artif Intell Med 2020;108:101939. [PMID: 32972666 DOI: 10.1016/j.artmed.2020.101939] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/06/2020] [Revised: 07/18/2020] [Accepted: 08/07/2020] [Indexed: 11/22/2022]

Zhou L, Cheng C, Ou D, Huang H. Construction of a semi-automatic ICD-10 coding system. BMC Med Inform Decis Mak 2020;20:67. [PMID: 32293423 PMCID: PMC7157985 DOI: 10.1186/s12911-020-1085-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/18/2019] [Accepted: 03/30/2020] [Indexed: 01/29/2023] Open

Abstract

Background

The International Classification of Diseases, 10th Revision (ICD-10) has been widely used to describe the diagnosis information of patients. Automatic ICD-10 coding is important because manually assigning codes is expensive, time consuming and error prone. Although numerous approaches have been developed to explore automatic coding, few of them have been applied in practice. Our aim is to construct a practical, automatic ICD-10 coding machine to improve coding efficiency and quality in daily work.

Methods

In this study, we propose the use of regular expressions (regexps) to establish a correspondence between diagnosis codes and diagnosis descriptions in outpatient settings and at admission and discharge. The description models of the regexps were embedded in our upgraded coding system, which queries a diagnosis description and assigns a unique diagnosis code. As in most studies, precision (P), recall (R), F-measure (F) and overall accuracy (A) were used to evaluate system performance. Our study had two stages. The datasets were obtained from the diagnosis information on the homepage of the discharge medical record. The testing sets were from October 1, 2017 to April 30, 2018 and from July 1, 2018 to January 31, 2019.
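A minimal sketch of regexp-based code assignment and its evaluation follows. The patterns in `RULES` are hypothetical illustrations, not the description models from the study, and the metric definitions are one common convention (precision over the descriptions the system coded, recall over all descriptions); the abstract does not give the exact formulas used.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical description models: each pattern maps to one ICD-10 code.
RULES = [
    (re.compile(r"\b(essential|primary)\s+hypertension\b", re.IGNORECASE), "I10"),
    (re.compile(r"\btype\s*(2|ii)\s+diabetes\b", re.IGNORECASE), "E11.9"),
    (re.compile(r"\bpneumonia\b", re.IGNORECASE), "J18.9"),
]


def assign_code(description: str) -> Optional[str]:
    """Return the code of the first matching pattern, or None if nothing matches."""
    for pattern, code in RULES:
        if pattern.search(description):
            return code
    return None


def precision_recall_f(predicted: List[Optional[str]], gold: List[str]) -> Tuple[float, float, float]:
    """Precision over attempted codes, recall over all cases, and their harmonic mean."""
    attempted = [(p, g) for p, g in zip(predicted, gold) if p is not None]
    correct = sum(1 for p, g in attempted if p == g)
    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    f_measure = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f_measure


descriptions = [
    "Essential hypertension",
    "Type 2 diabetes mellitus",
    "Community-acquired pneumonia",
    "Fracture of femur",  # no rule covers this, so it is left uncoded
]
gold_codes = ["I10", "E11.9", "J18.9", "S72.9"]
predicted_codes = [assign_code(d) for d in descriptions]
p, r, f = precision_recall_f(predicted_codes, gold_codes)
```

Leaving unmatched descriptions uncoded, rather than guessing, keeps precision high at the cost of recall, which is the trade-off a rule-based system of this kind typically makes.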

Results

The values of P were 89.27% and 88.38% in the first and second testing phases, respectively, which demonstrates high precision. The automatic ICD-10 coding system completed more than 160,000 codes in 16 months, which reduced the workload of the coders. In addition, a comparison of the time needed for manual and automatic coding indicated the effectiveness of the system: automatic coding took nearly 100 times less time than manual coding.

Conclusions

Our automatic coding system is well suited for the coding task. Further studies are warranted to perfect the description models of the regexps and to develop synthetic approaches to improve system performance.

Rios A, Durbin EB, Hands I, Arnold SM, Shah D, Schwartz SM, Goulart BHL, Kavuluru R. Cross-registry neural domain adaptation to extract mutational test results from pathology reports. J Biomed Inform 2019;97:103267. [PMID: 31401235 PMCID: PMC6736690 DOI: 10.1016/j.jbi.2019.103267] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/23/2019] [Revised: 07/30/2019] [Accepted: 08/05/2019] [Indexed: 10/26/2022]

Abstract

OBJECTIVE

We study the performance of machine learning (ML) methods, including neural networks (NNs), to extract mutational test results from pathology reports collected by cancer registries. Given the lack of hand-labeled datasets for mutational test result extraction, we focus on the particular use-case of extracting Epidermal Growth Factor Receptor mutation results in non-small cell lung cancers. We explore the generalization of NNs across different registries where our goals are twofold: (1) to assess how well models trained on a registry's data port to test data from a different registry and (2) to assess whether and to what extent such models can be improved using state-of-the-art neural domain adaptation techniques under different assumptions about what is available (labeled vs unlabeled data) at the target registry site.

MATERIALS AND METHODS

We collected data from two registries: the Kentucky Cancer Registry (KCR) and the Fred Hutchinson Cancer Research Center (FH) Cancer Surveillance System. We combine NNs with adversarial domain adaptation to improve cross-registry performance. We compare to other classifiers in the standard supervised classification, unsupervised domain adaptation, and supervised domain adaptation scenarios.

RESULTS

The performance of ML methods varied between registries. To extract positive results, the basic convolutional neural network (CNN) had an F1 of 71.5% on the KCR dataset and 95.7% on the FH dataset. For the KCR dataset, the CNN F1 results were low when trained on FH data (Positive F1: 23%). Using our proposed adversarial CNN, without any labeled data, we match the F1 of the models trained directly on each target registry's data. The adversarial CNN F1 improved when trained on FH and applied to KCR dataset (Positive F1: 70.8%). We found similar performance improvements when we trained on KCR and tested on FH reports (Positive F1: 45% to 96%).

CONCLUSION

Adversarial domain adaptation improves the performance of NNs applied to pathology reports. In the unsupervised domain adaptation setting, we match the performance of models that are trained directly on target registry's data by using source registry's labeled data and unlabeled examples from the target registry.
