51
Feki I, Ammar S, Kessentini Y, Muhammad K. Federated learning for COVID-19 screening from Chest X-ray images. Appl Soft Comput 2021; 106:107330. [PMID: 33776607 PMCID: PMC7979273 DOI: 10.1016/j.asoc.2021.107330] [Citation(s) in RCA: 64] [Impact Index Per Article: 16.0] [Received: 09/17/2020] [Revised: 02/17/2021] [Accepted: 03/16/2021] [Indexed: 12/14/2022]
Abstract
Today, the whole world is facing a great medical disaster that affects the health and lives of people: the COVID-19 disease, colloquially known as the coronavirus. Deep learning is an effective means of assisting radiologists in analyzing the vast number of chest X-ray images, which can potentially play a substantial role in streamlining and accelerating the diagnosis of COVID-19. Such techniques require large datasets for training, and all such data must be centralized in order to be processed. Due to medical data privacy regulations, it is often not possible to collect and share patient data on a centralized data server. In this work, we present a collaborative federated learning framework that allows multiple medical institutions to screen for COVID-19 from chest X-ray images using deep learning without sharing patient data. We investigate several key properties and specificities of the federated learning setting, including the non-independent and identically distributed (non-IID) and unbalanced data distributions that naturally arise. We experimentally demonstrate that the proposed federated learning framework provides results competitive with those of models trained by sharing data, considering two different model architectures. These findings should encourage medical institutions to adopt a collaborative process and reap the benefits of rich private data in order to rapidly build a powerful model for COVID-19 screening.
Affiliation(s)
- Ines Feki
- Digital Research Center of Sfax, B.P. 275, Sakiet Ezzit, 3021 Sfax, Tunisia
- Sourour Ammar
- Digital Research Center of Sfax, B.P. 275, Sakiet Ezzit, 3021 Sfax, Tunisia; SM@RTS: Laboratory of Signals, systeMs, aRtificial Intelligence and neTworkS, Sfax, Tunisia
- Yousri Kessentini
- Digital Research Center of Sfax, B.P. 275, Sakiet Ezzit, 3021 Sfax, Tunisia; SM@RTS: Laboratory of Signals, systeMs, aRtificial Intelligence and neTworkS, Sfax, Tunisia
- Khan Muhammad
- Department of Software, Sejong University, Seoul 143-747, Republic of Korea
52
Si Y, Du J, Li Z, Jiang X, Miller T, Wang F, Jim Zheng W, Roberts K. Deep representation learning of patient data from Electronic Health Records (EHR): A systematic review. J Biomed Inform 2021; 115:103671. [PMID: 33387683 PMCID: PMC11290708 DOI: 10.1016/j.jbi.2020.103671] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Received: 05/16/2020] [Revised: 10/23/2020] [Accepted: 12/23/2020] [Indexed: 12/22/2022]
Abstract
OBJECTIVES Patient representation learning refers to learning a dense mathematical representation of a patient that encodes meaningful information from Electronic Health Records (EHRs). This is generally performed using advanced deep learning methods. This study presents a systematic review of this field and provides both qualitative and quantitative analyses from a methodological perspective. METHODS We identified studies developing patient representations from EHRs with deep learning methods from MEDLINE, EMBASE, Scopus, the Association for Computing Machinery (ACM) Digital Library, and the Institute of Electrical and Electronics Engineers (IEEE) Xplore Digital Library. After screening 363 articles, 49 papers were included for a comprehensive data collection. RESULTS Publications developing patient representations almost doubled each year from 2015 until 2019. We noticed a typical workflow starting with feeding raw data, applying deep learning models, and ending with clinical outcome predictions as evaluations of the learned representations. Specifically, learning representations from structured EHR data was dominant (37 out of 49 studies). Recurrent Neural Networks were widely applied as the deep learning architecture (Long short-term memory: 13 studies, Gated recurrent unit: 11 studies). Learning was mainly performed in a supervised manner (30 studies) optimized with cross-entropy loss. Disease prediction was the most common application and evaluation (31 studies). Benchmark datasets were mostly unavailable (28 studies) due to privacy concerns of EHR data, and code availability was assured in 20 studies. DISCUSSION & CONCLUSION The existing predictive models mainly focus on the prediction of single diseases, rather than considering the complex mechanisms of patients from a holistic review. We show the importance and feasibility of learning comprehensive representations of patient EHR data through a systematic review. 
Advances in patient representation learning techniques will be essential for powering patient-level EHR analyses. Future work will still be devoted to leveraging the richness and potential of available EHR data. Reproducibility and transparency of reported results will hopefully improve. Knowledge distillation and advanced learning techniques will be exploited to assist the capability of learning patient representation further.
Affiliation(s)
- Yuqi Si
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, TX, USA
- Jingcheng Du
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, TX, USA
- Zhao Li
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, TX, USA
- Xiaoqian Jiang
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, TX, USA
- Timothy Miller
- Computational Health Informatics Program (CHIP), Boston Children's Hospital and Harvard Medical School, MA, USA
- Fei Wang
- Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, NY, USA
- W Jim Zheng
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, TX, USA
- Kirk Roberts
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, TX, USA
53
Lopez Pineda A, Pourshafeie A, Ioannidis A, Leibold CM, Chan AL, Bustamante CD, Frankovich J, Wojcik GL. Discovering prescription patterns in pediatric acute-onset neuropsychiatric syndrome patients. J Biomed Inform 2020; 113:103664. [PMID: 33359113 DOI: 10.1016/j.jbi.2020.103664] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/29/2020] [Revised: 10/28/2020] [Accepted: 12/10/2020] [Indexed: 11/28/2022]
Abstract
OBJECTIVE Pediatric acute-onset neuropsychiatric syndrome (PANS) is a complex neuropsychiatric syndrome characterized by an abrupt onset of obsessive-compulsive symptoms and/or severe eating restrictions, along with at least two concomitant debilitating cognitive, behavioral, or neurological symptoms. A wide range of pharmacological interventions, along with behavioral and environmental modifications and psychotherapies, have been adopted to treat symptoms and underlying etiologies. Our goal was to develop a data-driven approach to identify treatment patterns in this cohort. MATERIALS AND METHODS In this cohort study, we extracted medical prescription histories from electronic health records. We developed a modified dynamic programming approach to perform global alignment of those medication histories. Our approach is unique in that it considers time gaps in prescription patterns as part of the similarity strategy. RESULTS This study included 43 consecutive new-onset pre-pubertal patients who had at least 3 clinic visits. Our algorithm identified six clusters with distinct medication usage histories, which may represent clinicians' practice of treating PANS of different severities and etiologies: two most severe groups requiring high-dose intravenous steroids; two arthritic or inflammatory groups requiring prolonged nonsteroidal anti-inflammatory drug (NSAID) treatment; and two mild relapsing/remitting groups treated with a short course of NSAID. The psychometric scores used as outcomes in each cluster generally improved within the first two years. DISCUSSION AND CONCLUSION Our algorithm shows potential to improve our knowledge of treatment patterns in the PANS cohort, while helping clinicians understand how patients respond to a combination of drugs.
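The abstract above does not give the scoring scheme of its modified alignment, so the following is only a minimal sketch of the general idea: a standard Needleman-Wunsch global alignment over prescription sequences in which a time gap is encoded as an explicit token. The `GAP` token, the drug labels, and the score values (`match`, `mismatch`, `indel`) are all hypothetical choices made for illustration, not the authors' method.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, indel=-2):
    """Needleman-Wunsch global alignment score for two token sequences.

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty sequence costs one indel per token.
    for i in range(1, rows):
        score[i][0] = i * indel
    for j in range(1, cols):
        score[0][j] = j * indel
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diagonal, score[i - 1][j] + indel, score[i][j - 1] + indel)
    return score[-1][-1]


# Hypothetical encoding: "_" marks a period with no active prescription.
GAP = "_"
patient_1 = ["NSAID", GAP, "NSAID", "steroid"]
patient_2 = ["NSAID", "NSAID", "steroid"]

print(global_alignment_score(patient_1, patient_1))  # identical histories
print(global_alignment_score(patient_1, patient_2))  # differ only by a time gap
```

Because the gap is an ordinary token here, two histories with the same drugs but different timing score lower than two identical histories, which is one simple way for timing to enter the similarity. Pairwise scores of this kind could then feed a clustering step.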
Affiliation(s)
- Arturo Lopez Pineda
- Department of Biomedical Data Science, Stanford University, CA, USA; Department of Data Science, Amphora Health, Morelia, Mexico
- Armin Pourshafeie
- Department of Biomedical Data Science, Stanford University, CA, USA; Department of Physics, Stanford University, CA, USA
- Collin McCloskey Leibold
- Department of Pediatrics, Division of Allergy, Immunology, and Rheumatology, Stanford University, CA, USA; Department of Medicine, University of Massachusetts Medical School, Worcester, MA, USA
- Avis L Chan
- Department of Pediatrics, Division of Allergy, Immunology, and Rheumatology, Stanford University, CA, USA
- Carlos D Bustamante
- Department of Biomedical Data Science, Stanford University, CA, USA; Department of Genetics, Stanford University, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA
- Jennifer Frankovich
- Department of Pediatrics, Division of Allergy, Immunology, and Rheumatology, Stanford University, CA, USA
- Genevieve L Wojcik
- Department of Biomedical Data Science, Stanford University, CA, USA; Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA
54
Xu J, Glicksberg BS, Su C, Walker P, Bian J, Wang F. Federated Learning for Healthcare Informatics. J Healthc Inform Res 2020; 5:1-19. [PMID: 33204939 PMCID: PMC7659898 DOI: 10.1007/s41666-020-00082-4] [Citation(s) in RCA: 232] [Impact Index Per Article: 46.4] [Received: 08/19/2020] [Revised: 10/21/2020] [Accepted: 10/30/2020] [Indexed: 01/02/2023]
Abstract
With the rapid development of computer software and hardware technologies, more and more healthcare data are becoming readily available from clinical institutions, patients, insurance companies, and pharmaceutical industries, among others. This access provides an unprecedented opportunity for data science technologies to derive data-driven insights and improve the quality of care delivery. Healthcare data, however, are usually fragmented and private making it difficult to generate robust results across populations. For example, different hospitals own the electronic health records (EHR) of different patient populations and these records are difficult to share across hospitals because of their sensitive nature. This creates a big barrier for developing effective analytical approaches that are generalizable, which need diverse, “big data.” Federated learning, a mechanism of training a shared global model with a central server while keeping all the sensitive data in local institutions where the data belong, provides great promise to connect the fragmented healthcare data sources with privacy-preservation. The goal of this survey is to provide a review for federated learning technologies, particularly within the biomedical space. In particular, we summarize the general solutions to the statistical challenges, system challenges, and privacy issues in federated learning, and point out the implications and potentials in healthcare.
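The server-side step of "training a shared global model with a central server" while the data stay local can be illustrated with federated averaging, a commonly used aggregation rule. The abstract does not name a specific algorithm, so this is a sketch under that assumption; the function name and the flat-list representation of model parameters are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average client model parameters, weighted by local dataset size.

    client_weights: one flat list of parameters per client (same length each).
    client_sizes: number of local training examples held by each client.
    Only parameters and counts reach the server; raw records never do.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(weights[k] * size for weights, size in zip(client_weights, client_sizes)) / total
        for k in range(n_params)
    ]


# Two hypothetical hospitals: the second holds three times as much data.
hospital_a = [1.0, 2.0]
hospital_b = [3.0, 4.0]
global_model = federated_average([hospital_a, hospital_b], [1, 3])
print(global_model)
```

In a full training loop, the server would send `global_model` back to each site, each site would train locally for a few steps, and the averaging would repeat.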
Affiliation(s)
- Jie Xu
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY USA
- Benjamin S Glicksberg
- Institute for Digital Health, Icahn School of Medicine at Mount Sinai, New York, NY USA
- Chang Su
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY USA
- Peter Walker
- U.S. Department of Defense Joint Artificial Intelligence Center, Washington, D.C., USA
- Jiang Bian
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL USA
- Fei Wang
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY USA
55
Park CW, Seo SW, Kang N, Ko B, Choi BW, Park CM, Chang DK, Kim H, Kim H, Lee H, Jang J, Ye JC, Jeon JH, Seo JB, Kim KJ, Jung KH, Kim N, Paek S, Shin SY, Yoo S, Choi YS, Kim Y, Yoon HJ. Artificial Intelligence in Health Care: Current Applications and Issues. J Korean Med Sci 2020; 35:e379. [PMID: 33140591 PMCID: PMC7606883 DOI: 10.3346/jkms.2020.35.e379] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 06/26/2020] [Accepted: 09/23/2020] [Indexed: 12/11/2022]
Abstract
In recent years, artificial intelligence (AI) technologies have greatly advanced and become a reality in many areas of our daily lives. In the health care field, numerous efforts are being made to implement AI technology for practical medical treatments. With the rapid developments in machine learning algorithms and improvements in hardware performance, AI technology is expected to play an important role in effectively analyzing and utilizing extensive amounts of health and medical data. However, AI technology has various unique characteristics that differ from those of existing health care technologies. Consequently, a number of areas need to be supplemented within the current health care system for AI to be utilized more effectively and frequently in health care. In addition, the number of medical practitioners and members of the public who accept AI in health care is still low; moreover, there are various concerns regarding the safety and reliability of AI technology implementations. Therefore, this paper aims to introduce the current research and application status of AI technology in health care and discuss the issues that need to be resolved.
Affiliation(s)
- Chan Woo Park
- Department of Orthopedic Surgery, Samsung Medical Center, School of Medicine, Sungkyunkwan University, Seoul, Korea
- Sung Wook Seo
- Department of Orthopedic Surgery, Samsung Medical Center, School of Medicine, Sungkyunkwan University, Seoul, Korea
- Noeul Kang
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Samsung Medical Center, School of Medicine, Sungkyunkwan University, Seoul, Korea
- BeomSeok Ko
- Department of Surgery, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Byung Wook Choi
- Department of Radiology, Severance Hospital, Yonsei University College of Medicine, Seoul, Korea
- Chang Min Park
- Department of Radiology, Seoul National University College of Medicine, Seoul, Korea
- Dong Kyung Chang
- Division of Gastroenterology, Department of Medicine, Samsung Medical Center, School of Medicine, Sungkyunkwan University, Seoul, Korea
- Hwiyoung Kim
- Department of Radiology, Severance Hospital, Yonsei University College of Medicine, Seoul, Korea
- Hyunchul Kim
- Department of R&D Planning, Korea Health Industry Development Institute (KHIDI), Cheongju, Korea
- Hyunna Lee
- Health Innovation Big Data Center, Asan Institute for Life Science, Asan Medical Center, Seoul, Korea
- Jinhee Jang
- Department of Radiology, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Jong Chul Ye
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea
- Jong Hong Jeon
- Protocol Engineering Center, Electronics and Telecommunications Research Institute (ETRI), Daejeon, Korea
- Joon Beom Seo
- Department of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Kwang Joon Kim
- Division of Geriatrics, Department of Internal Medicine, Severance Hospital, Yonsei University College of Medicine, Seoul, Korea
- Namkug Kim
- Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Soo Yong Shin
- Big Data Research Center, Samsung Medical Center, School of Medicine, Sungkyunkwan University, Seoul, Korea
- Soyoung Yoo
- Health Innovation Big Data Center, Asan Institute for Life Science, Asan Medical Center, Seoul, Korea
- Youngjun Kim
- Center for Bionics, Korea Institute of Science and Technology (KIST), Seoul, Korea
- Hyung Jin Yoon
- Department of Biomedical Engineering, Seoul National University College of Medicine, Seoul, Korea
56
Rieke N, Hancox J, Li W, Milletarì F, Roth HR, Albarqouni S, Bakas S, Galtier MN, Landman BA, Maier-Hein K, Ourselin S, Sheller M, Summers RM, Trask A, Xu D, Baust M, Cardoso MJ. The future of digital health with federated learning. NPJ Digit Med 2020; 3:119. [PMID: 33015372 PMCID: PMC7490367 DOI: 10.1038/s41746-020-00323-1] [Citation(s) in RCA: 647] [Impact Index Per Article: 129.4] [Received: 03/17/2020] [Accepted: 08/12/2020] [Indexed: 12/17/2022]
Abstract
Data-driven machine learning (ML) has emerged as a promising approach for building accurate and robust statistical models from medical data, which is collected in huge volumes by modern healthcare systems. Existing medical data is not fully exploited by ML primarily because it sits in data silos and privacy concerns restrict access to this data. However, without access to sufficient data, ML will be prevented from reaching its full potential and, ultimately, from making the transition from research to clinical practice. This paper considers key factors contributing to this issue, explores how federated learning (FL) may provide a solution for the future of digital health and highlights the challenges and considerations that need to be addressed.
Affiliation(s)
- Nicola Rieke
- NVIDIA GmbH, Munich, Germany
- Technical University of Munich (TUM), Munich, Germany
- Shadi Albarqouni
- Technical University of Munich (TUM), Munich, Germany
- Imperial College London, London, UK
- Spyridon Bakas
- University of Pennsylvania (UPenn), Philadelphia, PA USA
- Klaus Maier-Hein
- German Cancer Research Center (DKFZ), Heidelberg, Germany
- Heidelberg University Hospital, Heidelberg, Germany
- Ronald M. Summers
- Clinical Center, National Institutes of Health (NIH), Bethesda, MD USA
- Andrew Trask
- OpenMined, Oxford, UK
- University of Oxford, Oxford, UK
- Centre for the Governance of AI (GovAI), Oxford, UK
57
Kapa S, Halamka J, Raskar R. Contact Tracing to Manage COVID-19 Spread-Balancing Personal Privacy and Public Health. Mayo Clin Proc 2020; 95:1320-1322. [PMID: 32622440 PMCID: PMC7196395 DOI: 10.1016/j.mayocp.2020.04.031] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 04/14/2020] [Revised: 04/21/2020] [Accepted: 04/28/2020] [Indexed: 02/03/2023]
Affiliation(s)
- Suraj Kapa
- Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN
- John Halamka
- Mayo Clinic Platform, Mayo Clinic, Rochester, MN
- Ramesh Raskar
- MIT Media Lab, Massachusetts Institute of Technology, Cambridge, MA
58
Goecks J, Jalili V, Heiser LM, Gray JW. How Machine Learning Will Transform Biomedicine. Cell 2020; 181:92-101. [PMID: 32243801 PMCID: PMC7141410 DOI: 10.1016/j.cell.2020.03.022] [Citation(s) in RCA: 268] [Impact Index Per Article: 53.6] [Received: 02/24/2020] [Revised: 03/07/2020] [Accepted: 03/09/2020] [Indexed: 12/15/2022]
Abstract
This Perspective explores the application of machine learning toward improved diagnosis and treatment. We outline a vision for how machine learning can transform three broad areas of biomedicine: clinical diagnostics, precision treatments, and health monitoring, where the goal is to maintain health through a range of diseases and the normal aging process. For each area, early instances of successful machine learning applications are discussed, as well as opportunities and challenges for machine learning. When these challenges are met, machine learning promises a future of rigorous, outcomes-based medicine with detection, diagnosis, and treatment strategies that are continuously adapted to individual and environmental differences.
Affiliation(s)
- Jeremy Goecks
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- Vahid Jalili
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- Laura M Heiser
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- Joe W Gray
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
59
Liu Y, Tian M, Xu C, Zhao L. Neural network feature learning based on image self-encoding. Int J Adv Robot Syst 2020. [DOI: 10.1177/1729881420921653] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022]
Abstract
With the rapid development of information technology and the arrival of the era of big data, people increasingly rely on images to access information. Today, the volume of image data is growing exponentially, and using deep learning models to extract valuable information from such massive data is very important. In this situation, people cannot find the information they need accurately and in a timely manner, so research on image retrieval technology is very important. Image retrieval is an important technology in computer vision and image processing; it enables fast and accurate querying of similar images in an image database. A good feature representation not only represents the category information of an image but also captures its relevant semantic information. Combining neural network feature learning with image retrieval should therefore improve the application of image retrieval technology. To address these problems, this article studies the problems encountered in deep learning neural network feature learning based on image self-encoding and discusses its feature expression in the field of image retrieval. By adding the spatial relationship information obtained by image self-encoding to the neural network training process, the feature expression ability of the selected neural network is improved, and neural network feature learning based on image coding is successfully applied to the popular field of image retrieval.
Affiliation(s)
- Yangyang Liu
- School of Mechanical Engineering and Electronic Information, China University of Geosciences, Wuhan, Hubei, China
- Minghua Tian
- College of Electronic Information Engineering, Inner Mongolia University, Hohhot, Neimenggu, China
- Chang Xu
- School of Mechanical Engineering and Electronic Information, China University of Geosciences, Wuhan, Hubei, China
- Lixiang Zhao
- School of Mechanical Engineering and Electronic Information, China University of Geosciences, Wuhan, Hubei, China
60
A patient-similarity-based model for diagnostic prediction. Int J Med Inform 2019; 135:104073. [PMID: 31923816 DOI: 10.1016/j.ijmedinf.2019.104073] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 10/04/2019] [Revised: 11/26/2019] [Accepted: 12/30/2019] [Indexed: 12/28/2022]
Abstract
OBJECTIVE To simulate the clinical reasoning of doctors, automatically retrieve patients analogous to an index patient, and predict diagnoses from the similar and dissimilar patients. METHODS We proposed a novel patient-similarity-based framework for diagnostic prediction, inspired by the structure-mapping theory of analogical reasoning in psychology. Patient similarity is defined as the similarity between two patients' diagnosis sets rather than as a dichotomous value (absence/presence of just one disease). The multilabel classification problem is converted to a single-value regression problem by integrating the pairwise patients' clinical features into a vector and taking the vector as the input and the patient similarity as the output. In contrast to the common k-NN method, which considers only the nearest neighbors, we not only utilize similar patients (positive analogy) to generate diagnostic hypotheses but also utilize dissimilar patients (negative analogy) to reject diagnostic hypotheses. RESULTS The patient-similarity-based models perform better than the one-vs-all baseline and traditional k-NN methods. The F1 score of positive-analogy-based prediction is 0.698, significantly higher than the scores of the baselines, which range from 0.368 to 0.661. It increases to 0.703 when the negative analogy method is applied to modify the prediction results of positive analogy. The performance of this method is highly promising for larger datasets. CONCLUSION The patient-similarity-based model provides diagnostic decision support that is more accurate, generalizable, and interpretable than that of previous methods and is based on heterogeneous and incomplete data. The model also serves as a new application for the use of clinical big data through artificial intelligence technology.
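The abstract defines patient similarity over two patients' diagnosis sets but does not state the set-similarity measure used. As a sketch only, the Jaccard index is one common choice and is assumed here; the function name and the diagnosis labels are hypothetical.

```python
def diagnosis_set_similarity(diagnoses_a, diagnoses_b):
    """Jaccard similarity between two sets of diagnoses (an assumed measure).

    Returns a value in [0, 1] rather than a yes/no label for a single disease.
    Two empty sets are treated as identical (similarity 1.0) by convention here.
    """
    a, b = set(diagnoses_a), set(diagnoses_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical diagnosis labels for two patients sharing one of three diagnoses.
index_patient = {"hypertension", "type 2 diabetes"}
candidate = {"type 2 diabetes", "asthma"}
print(diagnosis_set_similarity(index_patient, candidate))
```

A continuous target of this kind is what lets the multilabel problem be posed as single-value regression over pairs of patients, as the abstract describes.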
61
Jackson G, Hu J. Artificial Intelligence in Health in 2018: New Opportunities, Challenges, and Practical Implications. Yearb Med Inform 2019; 28:52-54. [PMID: 31419815 PMCID: PMC6697508 DOI: 10.1055/s-0039-1677925] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 02/07/2023]
Abstract
Objective: To summarize significant research contributions to the field of artificial intelligence (AI) in health in 2018.
Methods: Ovid MEDLINE® and Web of Science® databases were searched to identify original research articles that were published in the English language during 2018 and presented advances in the science of AI applied in health. Queries employed Medical Subject Heading (MeSH®) terms and keywords representing AI methodologies and limited results to health applications. Section editors selected 15 best paper candidates that underwent peer review by internationally renowned domain experts. Final best papers were selected by the editorial board of the 2018 International Medical Informatics Association (IMIA) Yearbook.
Results: Database searches returned 1,480 unique publications. Best papers employed innovative AI techniques that incorporated domain knowledge or explored approaches to support distributed or federated learning. All top-ranked papers incorporated novel approaches to advance the science of AI in health and included rigorous evaluations of their methodologies.
Conclusions: Performance of state-of-the-art AI machine learning algorithms can be enhanced by approaches that employ a multidisciplinary biomedical informatics pipeline to incorporate domain knowledge and can overcome challenges such as sparse, missing, or inconsistent data. Innovative training heuristics and encryption techniques may support distributed learning with preservation of privacy.
Affiliation(s)
- Gretchen Jackson
- IBM Watson Health, Cambridge, Massachusetts, USA; Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Jianying Hu
- IBM Research, Yorktown Heights, New York, USA
62
Abstract
INTRODUCTION Artificial intelligence (AI) technologies continue to attract interest from a broad range of disciplines in recent years, including health. The increase in computer hardware and software applications in medicine, as well as digitization of health-related data together fuel progress in the development and use of AI in medicine. This progress provides new opportunities and challenges, as well as directions for the future of AI in health. OBJECTIVE The goals of this survey are to review the current state of AI in health, along with opportunities, challenges, and practical implications. This review highlights recent developments over the past five years and directions for the future. METHODS Publications over the past five years reporting the use of AI in health in clinical and biomedical informatics journals, as well as computer science conferences, were selected according to Google Scholar citations. Publications were then categorized into five different classes, according to the type of data analyzed. RESULTS The major data types identified were multi-omics, clinical, behavioral, environmental and pharmaceutical research and development (R&D) data. The current state of AI related to each data type is described, followed by associated challenges and practical implications that have emerged over the last several years. Opportunities and future directions based on these advances are discussed. CONCLUSION Technologies have enabled the development of AI-assisted approaches to healthcare. However, there remain challenges. Work is currently underway to address multi-modal data integration, balancing quantitative algorithm performance and qualitative model interpretability, protection of model security, federated learning, and model bias.
Affiliation(s)
- Fei Wang
- Division of Health Informatics, Department of Healthcare Policy and Research, Weill Cornell Medicine, Cornell University, NY, USA
63
Wang Y, Wen A, Liu S, Hersh W, Bedrick S, Liu H. Test collections for electronic health record-based clinical information retrieval. JAMIA Open 2019; 2:360-368. [PMID: 31709390 PMCID: PMC6824517 DOI: 10.1093/jamiaopen/ooz016] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 03/13/2019] [Revised: 04/26/2019] [Accepted: 04/03/2019] [Indexed: 01/03/2023]
Abstract
Objectives To create test collections for evaluating clinical information retrieval (IR) systems and advancing clinical IR research. Materials and Methods Electronic health record (EHR) data, including structured and free-text data, from 45 000 patients who are a part of the Mayo Clinic Biobank cohort was retrieved from the clinical data warehouse. The clinical IR system indexed a total of 42 million free-text EHR documents. The search queries consisted of 56 topics developed through a collaboration between Mayo Clinic and Oregon Health & Science University. We described the creation of test collections, including a to-be-evaluated document pool using five retrieval models, and human assessment guidelines. We analyzed the relevance judgment results in terms of human agreement and time spent, and results of three levels of relevance, and reported performance of five retrieval models. Results The two judges had a moderate overall agreement with a Kappa value of 0.49, spent a consistent amount of time judging the relevance, and were able to identify easy and difficult topics. The conventional retrieval model performed best on most topics while a concept-based retrieval model had better performance on the topics requiring conceptual level retrieval. Discussion IR can provide an alternate approach to leveraging clinical narratives for patient information discovery as it is less dependent on semantics. Our study showed the feasibility of test collections along with a few challenges. Conclusion The conventional test collections for evaluating the IR system show potential for successfully evaluating clinical IR systems with a few challenges to be investigated.
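The agreement figure reported above (a Kappa value of 0.49 between two judges) is a chance-corrected statistic. As a sketch, assuming unweighted Cohen's kappa (the abstract does not say which variant was used), it can be computed from two judges' labels as follows; the sample judgments are made up for illustration and do not reproduce the study's data.

```python
from collections import Counter


def cohens_kappa(labels_1, labels_2):
    """Unweighted Cohen's kappa for two raters labeling the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement).
    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(labels_1)
    observed = sum(x == y for x, y in zip(labels_1, labels_2)) / n
    counts_1, counts_2 = Counter(labels_1), Counter(labels_2)
    categories = set(labels_1) | set(labels_2)
    expected = sum(counts_1[c] * counts_2[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical three-level relevance judgments (0 = not, 1 = partially, 2 = relevant).
judge_1 = [2, 1, 0, 0]
judge_2 = [2, 0, 0, 0]
print(cohens_kappa(judge_1, judge_2))
```

In this toy example the judges agree on three of four documents (0.75 raw agreement), yet kappa is noticeably lower because much of that agreement would be expected by chance.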
Affiliation(s)
- Yanshan Wang
  - Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA
- Andrew Wen
  - Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA
- Sijia Liu
  - Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA
- William Hersh
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA
- Steven Bedrick
  - Department of Computer Science and Electrical Engineering, Oregon Health & Science University, Portland, Oregon, USA
- Hongfang Liu
  - Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA
64
Li Z, Roberts K, Jiang X, Long Q. Distributed learning from multiple EHR databases: Contextual embedding models for medical events. J Biomed Inform 2019; 92:103138. [PMID: 30825539 DOI: 10.1016/j.jbi.2019.103138] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 09/24/2018] [Revised: 02/15/2019] [Accepted: 02/16/2019] [Indexed: 11/26/2022]
Abstract
Electronic health record (EHR) data provide promising opportunities to explore personalized treatment regimes and to make clinical predictions. Compared with regular clinical data, EHR data are known for their irregularity and complexity. In addition, analyzing EHR data involves privacy issues, and sharing such data among multiple research sites is often infeasible due to regulatory and other hurdles. A recently published work uses contextual embedding models and successfully builds one predictive model for more than seventy common diagnoses. Despite its high predictive power, the model cannot be generalized to other institutions without sharing data. In this work, a novel method is proposed to learn from multiple databases and build predictive models based on Distributed Noise Contrastive Estimation (Distributed NCE). We use differential privacy to safeguard the intermediary information sharing. The numerical study with a real dataset demonstrates that the proposed method can not only build predictive models in a distributed manner with privacy protection, but also preserve model structure well and achieve comparable prediction accuracy. The proposed methods have been implemented as a stand-alone Python library, and the implementation is available on GitHub (https://github.com/ziyili20/DistributedLearningPredictor) with installation instructions and use cases.
Affiliation(s)
- Ziyi Li
  - Emory University, Department of Biostatistics and Bioinformatics, Atlanta, GA 30332, USA
- Kirk Roberts
  - University of Texas, Health Science Center at Houston, School of Biomedical Informatics, Houston, TX 77030, USA
- Xiaoqian Jiang
  - University of Texas, Health Science Center at Houston, School of Biomedical Informatics, Houston, TX 77030, USA
- Qi Long
  - University of Pennsylvania, Perelman School of Medicine, Department of Biostatistics, Epidemiology and Informatics, Philadelphia, PA 19104, USA