1
Schiano di Cola V, Chiaro D, Prezioso E, Izzo S, Giampaolo F. Insight Extraction From E-Health Bookings by Means of Hypergraph and Machine Learning. IEEE J Biomed Health Inform 2023; 27:4649-4659. [PMID: 37018305 DOI: 10.1109/jbhi.2022.3233498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2023]
Abstract
New technologies are transforming medicine, and this revolution starts with data. Usually, health services within public healthcare systems are accessed through a booking centre managed by local health authorities and controlled by the regional government. From this perspective, structuring e-health data through a Knowledge Graph (KG) approach can provide a feasible method to quickly and simply organize data and/or retrieve new information. Starting from raw health bookings data from the public healthcare system in Italy, a KG method is presented to support e-health services through the extraction of medical knowledge and novel insights. By exploiting graph embedding, which arranges the various attributes of the entities into the same vector space, we are able to apply Machine Learning (ML) techniques to the embedded vectors. The findings suggest that KGs could be used to assess patients' medical booking patterns using either unsupervised or supervised ML. In particular, the former can detect the possible presence of hidden groups of entities that are not immediately apparent from the original legacy dataset structure. The latter, although the performance of the algorithms used is not very high, shows encouraging results in predicting a patient's likelihood to undergo a particular medical visit within a year. However, many technological advances remain to be made, especially in graph database technologies and graph embedding algorithms.
2
Zhao G, Gu W, Cai W, Zhao Z, Zhang X, Liu J. MLEE: A method for extracting object-level medical knowledge graph entities from Chinese clinical records. Front Genet 2022; 13:900242. [PMID: 35938002 PMCID: PMC9354090 DOI: 10.3389/fgene.2022.900242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2022] [Accepted: 06/16/2022] [Indexed: 11/13/2022] [Open access]
Abstract
As a typical knowledge-intensive industry, the medical field uses knowledge graph technology to construct causal inference calculations, such as “symptom-disease”, “laboratory examination/imaging examination-disease”, and “disease-treatment method”. The continuous expansion of large electronic clinical records provides an opportunity to learn medical knowledge by machine learning. In this process, how to extract entities with a medical logic structure and how to make entity extraction more consistent with the logic of the text content in electronic clinical records are two issues that have become key in building a high-quality medical knowledge graph. In this work, we describe a method for extracting medical entities using real Chinese electronic clinical records. We define a computational architecture named MLEE to extract object-level entities with “object-attribute” dependencies. We conducted experiments based on randomly selected electronic clinical records of 1,000 patients from Shengjing Hospital of China Medical University to verify the effectiveness of the method.
Affiliation(s)
- Genghong Zhao
  - School of Computer Science and Engineering, Northeastern University, Shenyang, China
  - Neusoft Research of Intelligent Healthcare Technology, Shenyang, China
- Wenjian Gu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Wei Cai
  - Neusoft Research of Intelligent Healthcare Technology, Shenyang, China
- Zhiying Zhao
  - Department of Clinical Epidemiology, Shengjing Hospital of China Medical University, Shenyang, China
- Xia Zhang
  - School of Computer Science and Engineering, Northeastern University, Shenyang, China
  - Neusoft Research of Intelligent Healthcare Technology, Shenyang, China
- Jiren Liu
  - School of Computer Science and Engineering, Northeastern University, Shenyang, China
  - Neusoft Corporation, Shenyang, China
*Correspondence: Genghong Zhao; Xia Zhang; Jiren Liu
3
Jiang J, Yu X, Lin Y, Guan Y. PercolationDF: A percolation-based medical diagnosis framework. Math Biosci Eng 2022; 19:5832-5849. [PMID: 35603381 DOI: 10.3934/mbe.2022273] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/15/2023]
Abstract
Goal: With the continuing shortage and unequal distribution of medical resources, our objective is to develop a general diagnosis framework that utilizes a smaller amount of electronic medical records (EMRs) to alleviate the problem that the data volume requirement of prevailing models is too vast for medical institutions to afford. Methods: The framework proposed contains network construction, network expansion, and disease diagnosis methods. In the first two stages above, the knowledge extracted from EMRs is utilized to build and expand an EMR-based medical knowledge network (EMKN) to model and represent the medical knowledge. Then, percolation theory is modified to perform diagnosis on the EMKN. Result: Facing the lack of data, our framework outperforms naïve Bayes networks, neural networks and logistic regression, especially in the top-10 recall. Out of 207 test cases, 51.7% achieved 100% in the top-10 recall, 21% better than what was achieved in one of our previous studies. Conclusion: The experimental results show that the proposed framework may be useful for medical knowledge representation and diagnosis. The framework effectively alleviates the lack of data volume by inferring the knowledge modeled in EMKN. Significance: The proposed framework not only has applications for diagnosis but also may be extended to other domains to represent and model knowledge and to perform inference on that representation.
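The top-10 recall figure quoted in this abstract can be illustrated with a small sketch. The abstract does not define the metric, so the code assumes the common reading: for each test case, the fraction of true diagnoses that appear among the first k ranked predictions. The function name, the toy disease labels, and the choice of k = 3 are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Set


def top_k_recall(ranked: Sequence[str], truth: Set[str], k: int = 10) -> float:
    """Fraction of the true labels found among the first k ranked predictions."""
    if not truth:
        raise ValueError("truth must contain at least one label")
    top_k = set(ranked[:k])
    return len(top_k & truth) / len(truth)


# Hypothetical toy cases: (ranked predictions, true diagnoses).
cases = [
    (["flu", "pneumonia", "asthma"], {"flu", "asthma"}),
    (["gastritis", "ulcer", "flu"], {"ulcer", "hepatitis"}),
]

# Per-case recall at k = 3 (assumed small k so the toy data stays readable).
scores = [top_k_recall(ranked, truth, k=3) for ranked, truth in cases]

# Share of cases in which every true diagnosis was retrieved, which is the
# kind of quantity the abstract reports as "achieved 100% in the top-10 recall".
share_full_recall = sum(score == 1.0 for score in scores) / len(scores)
```

Under this reading, the reported 51.7% would be the share of the 207 test cases whose per-case recall equals 1.0 at k = 10.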
Affiliation(s)
- Jingchi Jiang
  - The Artificial Intelligence Institute, Harbin Institute of Technology, Harbin, China
- Xuehui Yu
  - The Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Yi Lin
  - The Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Yi Guan
  - The Faculty of Computing, Harbin Institute of Technology, Harbin, China
4
AIM in Electronic Health Records (EHRs). Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_47] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
5
Yu G, Chen Z, Wu J, Tan Y. A diagnostic prediction framework on auxiliary medical system for breast cancer in developing countries. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107459] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/20/2022]
6
Song K, Zeng X, Zhang Y, De Jonckheere J, Yuan X, Koehl L. An interpretable knowledge-based decision support system and its applications in pregnancy diagnosis. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.106835] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/24/2022]
7
Yang Y, Huo H, Jiang J, Sun X, Guan Y, Guo X, Wan X, Liu S. Clinical decision-making framework against over-testing based on modeling implicit evaluation criteria. J Biomed Inform 2021; 119:103823. [PMID: 34044155 DOI: 10.1016/j.jbi.2021.103823] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/08/2020] [Revised: 05/20/2021] [Accepted: 05/21/2021] [Indexed: 12/25/2022]
Abstract
Different statistical methods include various subjective criteria that can prevent over-testing. However, no unified framework that defines generalized objective criteria for various diseases is available to determine the appropriateness of diagnostic tests recommended by doctors. We present the clinical decision-making framework against over-testing based on modeling the implicit evaluation criteria (CDFO-MIEC). The CDFO-MIEC quantifies the subjective evaluation process using statistics-based methods to identify over-testing. Furthermore, it determines the test's appropriateness with extracted entities obtained via named entity recognition and entity alignment. More specifically, implicit evaluation criteria are defined, namely the correlation among the diagnostic tests, symptoms, and diseases, the confirmation function, and the exclusion function. Additionally, four evaluation strategies are implemented by applying statistical methods, including the multi-label k-nearest neighbor and the conditional probability algorithms, to model the implicit evaluation criteria. Finally, they are combined into a classification and regression tree to make the final decision. The CDFO-MIEC also provides interpretability by decision conditions for supporting each clinical decision of over-testing. We tested the CDFO-MIEC on 2,860 clinical texts obtained from a single respiratory medicine department in China with the appropriate confirmation by physicians. The dataset was supplemented with random inappropriate tests. The proposed framework outperformed the best competing text classification methods with a Mean_F1 of 0.9167. This determined whether the appropriate and inappropriate tests were properly classified. The four evaluation strategies captured the features effectively, and they were imperative. Therefore, the proposed CDFO-MIEC is feasible because it exhibits high performance and can prevent over-testing.
Affiliation(s)
- Yang Yang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Hongxing Huo
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Jingchi Jiang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Xuemei Sun
  - Hospital of Harbin Institute of Technology, Harbin 150003, China
- Yi Guan
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Xitong Guo
  - School of Management, Harbin Institute of Technology, Harbin 150001, China
- Xiang Wan
  - Shenzhen Research Institute of Big Data, Shenzhen 518000, China
- Shengping Liu
  - Unisound AI Technology Co., Ltd, Beijing 100083, China
8
Chary M, Boyer EW, Burns MM. Diagnosis of Acute Poisoning using explainable artificial intelligence. Comput Biol Med 2021; 134:104469. [PMID: 34022488 DOI: 10.1016/j.compbiomed.2021.104469] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/21/2021] [Revised: 04/23/2021] [Accepted: 05/01/2021] [Indexed: 11/19/2022]
Abstract
INTRODUCTION Medical toxicology is the clinical specialty that treats the toxic effects of substances, for example, an overdose, a medication error, or a scorpion sting. The volume of toxicological knowledge and research has, as with other medical specialties, outstripped the ability of the individual clinician to entirely master and stay current with it. The application of machine learning/artificial intelligence (ML/AI) techniques to medical toxicology is challenging because initial treatment decisions are often based on a few pieces of textual data and rely heavily on experience and prior knowledge. ML/AI techniques, moreover, often do not represent knowledge in a way that is transparent for the physician, raising barriers to usability. Logic-based systems are more transparent approaches, but often generalize poorly and require expert curation to implement and maintain. METHODS We constructed a probabilistic logic network to model how a toxicologist recognizes a toxidrome, using only physical exam findings. Our approach transparently mimics the knowledge representation and decision-making of practicing clinicians. We created a library of 300 synthetic cases of varying clinical complexity. Each case contained 5 physical exam findings drawn from a mixture of 1 or 2 toxidromes. We used this library to evaluate the performance of our probabilistic logic network, dubbed Tak, against 2 medical toxicologists and a decision tree model, as well as its ability to recover the actual diagnosis. RESULTS The inter-rater reliability between Tak and the consensus of human raters was κ = 0.8432 for straightforward cases, 0.4396 for moderately complex cases, and 0.3331 for challenging cases. The inter-rater reliability between the decision tree classifier and the consensus of human raters was κ = 0.2522 for straightforward cases, 0.1963 for moderately complex cases, and 0.0331 for challenging cases. CONCLUSIONS The software, dubbed Tak, performs comparably to humans on straightforward cases and intermediate difficulty cases, but is outperformed by humans on challenging clinical cases. Tak outperforms a decision tree classifier at all levels of difficulty. Our results are a proof-of-concept that, in a restricted domain, probabilistic logic networks can perform medical reasoning comparably to humans.
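The κ values reported above measure agreement between two raters after correcting for chance. The abstract does not say which kappa statistic was used; the sketch below assumes Cohen's kappa for two raters, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name and the toy toxidrome labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    p_e = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout; treat as full agreement.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical labels for four cases: a model versus a human consensus.
model = ["anticholinergic", "opioid", "opioid", "sympathomimetic"]
humans = ["anticholinergic", "opioid", "sympathomimetic", "sympathomimetic"]
kappa = cohen_kappa(model, humans)  # p_o = 0.75, p_e = 5/16
```

In this toy case the raters agree on 3 of 4 items, yet kappa is only about 0.64, because some of that agreement would be expected by chance alone.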
Affiliation(s)
- Michael Chary
  - Weill Cornell Medical Center, New York, NY, USA; Boston Children's Hospital, Boston, MA, USA
- Ed W Boyer
  - Brigham and Women's Hospital, Boston, MA, USA
9
AIM in Electronic Health Records (EHRs). Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_47-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
10
Guan Y, Jiang J. AIM in Electronic Health Records (EHRs). Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_47-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
11
Medical knowledge embedding based on recursive neural network for multi-disease diagnosis. Artif Intell Med 2019; 103:101772. [PMID: 32143787 DOI: 10.1016/j.artmed.2019.101772] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 06/23/2018] [Revised: 09/16/2019] [Accepted: 11/26/2019] [Indexed: 12/29/2022]
Abstract
The representation of knowledge based on first-order logic captures the richness of natural language and supports multiple probabilistic inference models. Although symbolic representation enables quantitative reasoning with statistical probability, it is difficult to utilize with machine learning models as they perform numerical operations. In contrast, knowledge embedding (i.e., high-dimensional and continuous vectors) is a feasible approach to complex reasoning that can not only retain the semantic information of knowledge, but also establish the quantifiable relationship among embeddings. In this paper, we propose a recursive neural knowledge network (RNKN), which combines medical knowledge based on first-order logic with a recursive neural network for multi-disease diagnosis. After the RNKN is efficiently trained using manually annotated Chinese Electronic Medical Records (CEMRs), diagnosis-oriented knowledge embeddings and weight matrices are learned. The experimental results confirm that the diagnostic accuracy of the RNKN is superior to those of four machine learning models, four classical neural networks and a Markov logic network. The results also demonstrate that the more explicit the evidence extracted from CEMRs, the better the performance. The RNKN gradually reveals the interpretation of knowledge embeddings as the number of training epochs increases.
12
Shen Y, Li Y, Zheng HT, Tang B, Yang M. Enhancing ontology-driven diagnostic reasoning with a symptom-dependency-aware Naïve Bayes classifier. BMC Bioinformatics 2019; 20:330. [PMID: 31196129 PMCID: PMC6567606 DOI: 10.1186/s12859-019-2924-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 08/09/2018] [Accepted: 05/31/2019] [Indexed: 11/10/2022] [Open access]
Abstract
Background: Ontology has attracted substantial attention from both academia and industry. Handling uncertainty reasoning is important in researching ontology. For example, when a patient is suffering from cirrhosis, the appearance of abdominal vein varices is four times more likely than the presence of bitter taste. Such medical knowledge is crucial for decision-making in various medical applications but is missing from existing medical ontologies. In this paper, we aim to discover medical knowledge probabilities from electronic medical record (EMR) texts to enrich ontologies. First, we build an ontology by identifying meaningful entity mentions from EMRs. Then, we propose a symptom-dependency-aware naïve Bayes classifier (SDNB) that is based on the assumption that there is a level of dependency among symptoms. To ensure the accuracy of the diagnostic classification, we incorporate the probability of a disease into the ontology via innovative approaches. Results: We conduct a series of experiments to evaluate whether the proposed method can discover meaningful and accurate probabilities for medical knowledge. Based on over 30,000 deidentified medical records, we explore 336 abdominal diseases and 81 related symptoms. Among these 336 gastrointestinal diseases, the probabilities of 31 diseases are obtained via our method. These 31 probabilities of diseases and 189 conditional probabilities between diseases and the symptoms are added into the generated ontology. Conclusion: In this paper, we propose a medical knowledge probability discovery method that is based on the analysis and extraction of EMR text data for enriching a medical ontology with probability information. The experimental results demonstrate that the proposed method can effectively identify accurate medical knowledge probability information from EMR data. In addition, the proposed method can efficiently and accurately calculate the probability of a patient suffering from a specified disease, thereby demonstrating the advantage of combining an ontology and a symptom-dependency-aware naïve Bayes classifier.
Affiliation(s)
- Ying Shen
  - School of Electronics and Computer Engineering, Peking University Shenzhen Graduate School, Shenzhen, 518055, People's Republic of China
- Hai-Tao Zheng
  - School of Information Science and Technology, Graduate School at Shenzhen, Tsinghua University, Shenzhen, 518055, People's Republic of China
- Buzhou Tang
  - Harbin Institute of Technology (Shenzhen), Shenzhen, 518055, People's Republic of China
- Min Yang
  - SIAT, Chinese Academy of Sciences, Shenzhen, 518055, People's Republic of China
13
Jiang J, Xie J, Zhao C, Su J, Guan Y, Yu Q. Max-margin weight learning for medical knowledge network. Comput Methods Programs Biomed 2018; 156:179-190. [PMID: 29428070 DOI: 10.1016/j.cmpb.2018.01.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/30/2017] [Revised: 10/30/2017] [Accepted: 01/10/2018] [Indexed: 06/08/2023]
Abstract
BACKGROUND AND OBJECTIVE The application of medical knowledge strongly affects the performance of intelligent diagnosis, and the method of learning the weights of medical knowledge plays a substantial role in probabilistic graphical models (PGMs). The purpose of this study is to investigate a discriminative weight-learning method based on a medical knowledge network (MKN). METHODS We propose a training model called the maximum margin medical knowledge network (M3KN), which is strictly derived for calculating the weight of medical knowledge. Using the definition of a reasonable margin, the weight learning can be transformed into a margin optimization problem. To solve the optimization problem, we adopt a sequential minimal optimization (SMO) algorithm and the clique property of a Markov network. Ultimately, M3KN not only incorporates the inference ability of PGMs but also deals with high-dimensional logic knowledge. RESULTS The experimental results indicate that M3KN obtains a higher F-measure score than the maximum likelihood learning algorithm of MKN for both Chinese Electronic Medical Records (CEMRs) and Blood Examination Records (BERs). Furthermore, the proposed approach is clearly superior to some classical machine learning algorithms for medical diagnosis. To demonstrate the importance of domain knowledge, we numerically verify that the diagnostic accuracy of M3KN gradually improves as the number of learned CEMRs, which contain important medical knowledge, increases. CONCLUSIONS Our experimental results show that the proposed method performs reliably for learning the weights of medical knowledge. M3KN outperforms other existing methods by achieving an F-measure of 0.731 for CEMRs and 0.4538 for BERs. This further illustrates that M3KN can facilitate the investigations of intelligent healthcare.
Affiliation(s)
- Jingchi Jiang
  - School of Computer Science and Technology, Harbin Institute of Technology, Comprehensive Building 803, Harbin 150001, China
- Jing Xie
  - School of Computer Science and Technology, Harbin Institute of Technology, Comprehensive Building 803, Harbin 150001, China
- Chao Zhao
  - School of Computer Science and Technology, Harbin Institute of Technology, Comprehensive Building 803, Harbin 150001, China
- Jia Su
  - School of Computer Science and Technology, Harbin Institute of Technology, Comprehensive Building 803, Harbin 150001, China
- Yi Guan
  - School of Computer Science and Technology, Harbin Institute of Technology, Comprehensive Building 803, Harbin 150001, China
- Qiubin Yu
  - Medical Record Room, The 2nd Affiliated Hospital of Harbin Medical University, Harbin 150086, China