1
Chen YJ, Lin IF, Chuang JH, Huang HL, Chan TC. Influenza vaccination is associated with a reduced risk of invasive aspergillosis in high-risk individuals in Taiwan: a population-based cohort study. Emerg Microbes Infect 2023; 12:2155584. [PMID: 36469743 PMCID: PMC9809410 DOI: 10.1080/22221751.2022.2155584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
Abstract
Invasive aspergillosis (IA) has become an emerging life-threatening disease in recent years. Influenza has been identified as an independent risk factor for IA. Vaccination is the most effective way to prevent influenza, but whether it can reduce IA in high-risk populations remains uncertain. We aimed to investigate the association between influenza vaccination and the risk of IA in a high-risk population. We performed a population-based cohort study of people who qualified for government-funded influenza vaccination and were at high risk for IA at the start of the influenza season each year between 2016 and 2019. We used Taiwan's National Health Insurance Research Database to identify influenza vaccination status and IA diagnoses during the follow-up period. We compared the risk of IA between people with and without vaccination using multivariable logistic regression analysis. Of a total of 8,544,451 people who were eligible during the 3 influenza seasons, 3,136,477 (36.7%) were vaccinated. A total of 1179 IA cases were identified during follow-up, an incidence of 13.8 cases per 100,000 high-risk individuals. Compared with the non-vaccinated group, vaccinated individuals had a 21% lower risk of IA (adjusted odds ratio 0.79, 95% confidence interval 0.70-0.90). Influenza vaccination was associated with a lower risk of IA among males, people with immunosuppressive conditions, malignancy, or diabetes, and those having host factors according to the European Organization for Research and Treatment of Cancer and the Mycoses Study Group Education and Research Consortium. Influenza vaccination is recommended for high-risk populations to reduce the risk of IA.
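The headline figures in this abstract can be rechecked from the counts it reports. The short sketch below is only an arithmetic check, not the authors' analysis. The variable names are illustrative, and reading the odds ratio as a relative risk reduction assumes the outcome is rare enough for the odds ratio to approximate the risk ratio.

```python
# Figures quoted in the abstract above.
eligible = 8_544_451    # people eligible over the 3 influenza seasons
vaccinated = 3_136_477  # of whom vaccinated
ia_cases = 1_179        # invasive aspergillosis cases during follow-up
adjusted_or = 0.79      # adjusted odds ratio, vaccinated vs. non-vaccinated

coverage_pct = 100 * vaccinated / eligible
incidence_per_100k = 100_000 * ia_cases / eligible

# Assumption: for a rare outcome the odds ratio approximates the risk ratio,
# so the relative reduction is read as (1 - OR).
relative_reduction_pct = 100 * (1 - adjusted_or)

print(f"vaccination coverage: {coverage_pct:.1f}%")
print(f"IA incidence: {incidence_per_100k:.1f} per 100,000")
print(f"relative reduction: {relative_reduction_pct:.0f}%")
```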
Affiliation(s)
- Yi-Jyun Chen
  - Institute of Public Health, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- I-Feng Lin
  - Institute of Public Health, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Jen-Hsiang Chuang
  - Institute of Public Health, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan; Centers for Disease Control, Taipei, Taiwan
- Hung-Ling Huang
  - Department of Internal Medicine, Kaohsiung Municipal Ta-Tung Hospital, Kaohsiung Medical University Hospital, Kaohsiung, Taiwan; Division of Pulmonary and Critical Care Medicine, Kaohsiung Medical University Hospital, Kaohsiung, Taiwan; Department of Internal Medicine, Kaohsiung Medical University Hospital, Kaohsiung, Taiwan; Graduate Institute of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
- Ta-Chien Chan
  - Institute of Public Health, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan; Research Center for Humanities and Social Sciences, Academia Sinica, 128 Academia Road, Section 2, Taipei 115, Taiwan

2
Singh VK, Shrivastava U, Bouayad L, Padmanabhan B, Ialynytchev A, Schultz SK. Machine learning for psychiatric patient triaging: an investigation of cascading classifiers. J Am Med Inform Assoc 2019; 25:1481-1487. [PMID: 30380082 DOI: 10.1093/jamia/ocy109] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 03/22/2018] [Accepted: 07/26/2018] [Indexed: 11/12/2022] Open
Abstract
Objective: Develop an approach, One-class-at-a-time, for triaging psychiatric patients using machine learning on textual patient records. Our approach aims to automate the triaging process and reduce expert effort while providing high classification reliability. Materials and Methods: The One-class-at-a-time approach is a multistage cascading classification technique that achieves higher triage classification accuracy than traditional multiclass classifiers by 1) classifying one class at a time (one per stage), and 2) identifying and applying the highest-accuracy classifier at each stage. The approach was evaluated using a unique dataset of 433 psychiatric patient records with a triage class label provided by the "I2B2 challenge," a recent competition in the medical informatics community. Results: The One-class-at-a-time cascading classifier outperformed state-of-the-art classification techniques with an overall classification accuracy of 77% among 4 classes, exceeding the accuracies of existing multiclass classifiers. The approach also enabled highly accurate classification of individual classes: severe and mild with 85% accuracy, moderate with 64% accuracy, and absent with 60% accuracy. Discussion: The triaging of psychiatric cases is a challenging problem due to the lack of clear guidelines and protocols. Our work presents a machine learning approach that uses psychiatric records to triage patients by severity. Conclusion: The One-class-at-a-time cascading classifier can be used as a decision aid to reduce the triaging effort of physicians and nurses, while providing a unique opportunity to involve experts at each stage to reduce false positives and further improve the system's accuracy.
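The cascading idea described in this abstract, deciding one class per stage and passing undecided records on, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' system. The stage order, the keyword rules standing in for trained classifiers, and the names `cascade_predict` and `stages` are all hypothetical, and the per-stage selection of the highest-accuracy classifier is omitted.

```python
from typing import Callable, List, Tuple

# A stage pairs a class label with a binary classifier for that class.
Stage = Tuple[str, Callable[[str], bool]]


def cascade_predict(record: str, stages: List[Stage], fallback: str) -> str:
    """Return the label of the first stage whose classifier accepts the record."""
    for label, accepts in stages:
        if accepts(record):
            return label
    return fallback  # no stage claimed the record


# Hypothetical keyword rules stand in for trained per-stage classifiers.
stages: List[Stage] = [
    ("severe", lambda text: "suicidal" in text.lower()),
    ("moderate", lambda text: "panic attacks" in text.lower()),
    ("mild", lambda text: "occasional worry" in text.lower()),
]

print(cascade_predict("Patient reports suicidal ideation", stages, "absent"))
```

Because each stage is an independent binary decision, a different model could be plugged in per stage, and a human reviewer could be placed between stages.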
Affiliation(s)
- Vivek Kumar Singh
  - Information Systems and Decision Sciences, MUMA College of Business, University of South Florida, Tampa, Florida, USA
- Utkarsh Shrivastava
  - Haworth College of Business, Department of Business Information Systems, Western Michigan University, Kalamazoo, Michigan, USA
- Lina Bouayad
  - College of Business, Information Systems and Business Analytics, Florida International University, Miami, Florida, USA; HSR&D Center of Innovation on Disability and Rehabilitation Research (CINDRR), James A. Haley Veterans Hospital, Tampa, Florida, USA
- Balaji Padmanabhan
  - Information Systems and Decision Sciences, MUMA College of Business, University of South Florida, Tampa, Florida, USA
- Anna Ialynytchev
  - HSR&D Center of Innovation on Disability and Rehabilitation Research (CINDRR), James A. Haley Veterans Hospital, Tampa, Florida, USA
- Susan K Schultz
  - James A. Haley Veterans Hospital, Geriatric Psychiatry, Tampa, Florida, USA; Department of Psychiatry and Behavioral Neurosciences, Morsani College of Medicine, University of South Florida, Tampa, Florida, USA

3
Yang CY, Chen RJ, Chou WL, Lee YJ, Lo YS. An Integrated Influenza Surveillance Framework Based on National Influenza-Like Illness Incidence and Multiple Hospital Electronic Medical Records for Early Prediction of Influenza Epidemics: Design and Evaluation. J Med Internet Res 2019; 21:e12341. [PMID: 30707099 PMCID: PMC6376337 DOI: 10.2196/12341] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 09/29/2018] [Revised: 12/18/2018] [Accepted: 01/20/2019] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Influenza is a leading cause of death worldwide and contributes to heavy economic losses to individuals and communities. Therefore, the early prediction of and interventions against influenza epidemics are crucial to reduce mortality and morbidity because of this disease. Similar to other countries, the Taiwan Centers for Disease Control and Prevention (TWCDC) has implemented influenza surveillance and reporting systems, which primarily rely on influenza-like illness (ILI) data reported by health care providers, for the early prediction of influenza epidemics. However, these surveillance and reporting systems show at least a 2-week delay in prediction, indicating the need for improvement. OBJECTIVE We aimed to integrate the TWCDC ILI data with electronic medical records (EMRs) of multiple hospitals in Taiwan. Our ultimate goal was to develop a national influenza trend prediction and reporting tool more accurate and efficient than the current influenza surveillance and reporting systems. METHODS First, the influenza expertise team at Taipei Medical University Health Care System (TMUHcS) identified surveillance variables relevant to the prediction of influenza epidemics. Second, we developed a framework for integrating the EMRs of multiple hospitals with the ILI data from the TWCDC website to proactively provide results of influenza epidemic monitoring to hospital infection control practitioners. Third, using the TWCDC ILI data as the gold standard for influenza reporting, we calculated Pearson correlation coefficients to measure the strength of the linear relationship between TMUHcS EMRs and regional and national TWCDC ILI data for 2 weekly time series datasets. Finally, we used the Moving Epidemic Method analyses to evaluate each surveillance variable for its predictive power for influenza epidemics. RESULTS Using this framework, we collected the EMRs and TWCDC ILI data of the past 3 influenza seasons (October 2014 to September 2017). 
On the basis of the EMRs of multiple hospitals, 3 surveillance variables, TMUHcS-ILI, TMUHcS-rapid influenza laboratory tests with positive results (RITP), and TMUHcS-influenza medication use (IMU), which reflected patients with ILI, those with positive results from rapid influenza diagnostic tests, and those treated with antiviral drugs, respectively, showed strong correlations with the TWCDC regional and national ILI data (r=.86-.98). Two of these surveillance variables, TMUHcS-RITP and TMUHcS-IMU, showed predictive power for influenza epidemics 3 to 4 weeks before the increase noted in the TWCDC ILI reports. CONCLUSIONS Our framework periodically integrated and compared surveillance data from multiple hospitals and the TWCDC website to maintain a certain prediction quality and proactively provide monitored results. Our results can be extended to other infectious diseases, mitigating the time and effort required for data collection and analysis. Furthermore, this approach may be developed as a cost-effective electronic surveillance tool for the early and accurate prediction of epidemics of influenza and other infectious diseases in densely populated regions and nations.
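The correlation step in this abstract compares two weekly time series with a Pearson coefficient. The sketch below shows that calculation in plain Python. The weekly counts are invented for illustration and are not data from the study, and `pearson_r` is a name chosen here.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly counts: hospital EMR signal vs. national ILI reports.
hospital_weekly = [12, 15, 21, 30, 44, 39, 25, 18]
national_weekly = [110, 130, 170, 260, 400, 350, 230, 160]

r = pearson_r(hospital_weekly, national_weekly)
print(f"r = {r:.2f}")
```

A lead-time analysis like the one reported would need more than a same-week correlation, for example correlating one series against shifted copies of the other.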
Affiliation(s)
- Cheng-Yi Yang
  - Graduate Institute of Biomedical Informatics, Taipei Medical University, Taipei, Taiwan
- Ray-Jade Chen
  - Department of Surgery, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan; Taipei Medical University Hospital, Taipei, Taiwan
- Wan-Lin Chou
  - Taipei Medical University Hospital, Taipei, Taiwan
- Yuarn-Jang Lee
  - Division of Infectious Disease, Department of Internal Medicine, Taipei Medical University Hospital, Taipei, Taiwan
- Yu-Sheng Lo
  - Graduate Institute of Biomedical Informatics, Taipei Medical University, Taipei, Taiwan

4
Global Research on Syndromic Surveillance from 1993 to 2017: Bibliometric Analysis and Visualization. SUSTAINABILITY 2018. [DOI: 10.3390/su10103414] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 01/14/2023]
Abstract
Syndromic surveillance aims at analyzing medical data to detect clusters of illness or forecast disease outbreaks. Although research in this field is flourishing in terms of publications, an overview of the global research output has been lacking. This paper analyzes the global scientific output of this research from 1993 to 2017 using bibliometric analysis and visualization. In particular, a data processing framework was proposed based on citation datasets collected from Scopus and Clarivate Analytics' Web of Science Core Collection (WoSCC). The bibliometric method and CiteSpace were used to analyze the institutions, countries, and research areas as well as the current hotspots and trends. The preprocessed dataset includes 14,680 citation records. The analysis identified the USA, England, Canada, France, and Australia as the five most productive countries publishing on syndromic surveillance. Among institutions, the US Centers for Disease Control and Prevention (CDC) ranked first. The reference co-citation analysis uncovered the common research venues, and further analysis of keyword co-occurrence revealed the most trending topics. The findings of this research will help enrich the field with a comprehensive view of the status and future trends of research on syndromic surveillance.

5
Rochefort CM, Buckeridge DL, Tanguay A, Biron A, D'Aragon F, Wang S, Gallix B, Valiquette L, Audet LA, Lee TC, Jayaraman D, Petrucci B, Lefebvre P. Accuracy and generalizability of using automated methods for identifying adverse events from electronic health record data: a validation study protocol. BMC Health Serv Res 2017; 17:147. [PMID: 28209197 PMCID: PMC5314632 DOI: 10.1186/s12913-017-2069-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 09/13/2016] [Accepted: 02/02/2017] [Indexed: 12/31/2022] Open
Abstract
Background: Adverse events (AEs) in acute care hospitals are frequent and associated with significant morbidity, mortality, and costs. Measuring AEs is necessary for quality improvement and benchmarking purposes, but current detection methods lack accuracy, efficiency, and generalizability. The growing availability of electronic health records (EHR) and the development of natural language processing techniques for encoding narrative data offer an opportunity to develop potentially better methods. The purpose of this study is to determine the accuracy and generalizability of using automated methods for detecting three high-incidence and high-impact AEs from EHR data: a) hospital-acquired pneumonia, b) ventilator-associated event, and c) central line-associated bloodstream infection. Methods: This validation study will be conducted among medical, surgical and ICU patients admitted between 2013 and 2016 to the Centre hospitalier universitaire de Sherbrooke (CHUS) and the McGill University Health Centre (MUHC), which has both French and English sites. A random 60% sample of CHUS patients will be used for model development purposes (cohort 1, development set). Using a random sample of these patients, a reference standard assessment of their medical chart will be performed. Multivariate logistic regression and the area under the curve (AUC) will be employed to iteratively develop and optimize three automated AE detection models (i.e., one per AE of interest) using EHR data from the CHUS. These models will then be validated on a random sample of the remaining 40% of CHUS patients (cohort 1, internal validation set) using chart review to assess accuracy. The most accurate models developed and validated at the CHUS will then be applied to EHR data from a random sample of patients admitted to the MUHC French site (cohort 2) and English site (cohort 3), a critical requirement given the use of narrative data, and accuracy will be assessed using chart review.
Generalizability will be determined by comparing AUCs from cohorts 2 and 3 to those from cohort 1. Discussion: This study will likely produce more accurate and efficient measures of AEs. These measures could be used to assess the incidence rates of AEs, evaluate the success of preventive interventions, or benchmark performance across hospitals.
Affiliation(s)
- Christian M Rochefort
  - School of Nursing, Faculty of Medicine and Health Sciences, University of Sherbrooke, 3001, 12e Avenue Nord, Sherbrooke, QC, J1H 5N4, Canada; Centre de recherche de l'Hôpital Charles-LeMoyne, University of Sherbrooke-Campus Longueuil, 150 Place Charles-LeMoyne, Longueuil, QC, J4K 0A8, Canada; Department of Epidemiology, Biostatistics and Occupational Health, Faculty of Medicine, McGill University, Purvis Hall, 1020 Pine Avenue West, Montreal, QC, H3A 1A2, Canada
- David L Buckeridge
  - Department of Epidemiology, Biostatistics and Occupational Health, Faculty of Medicine, McGill University, Purvis Hall, 1020 Pine Avenue West, Montreal, QC, H3A 1A2, Canada
- Andréanne Tanguay
  - School of Nursing, Faculty of Medicine and Health Sciences, University of Sherbrooke, 3001, 12e Avenue Nord, Sherbrooke, QC, J1H 5N4, Canada
- Alain Biron
  - Department of Quality, Patient Safety and Performance, McGill University Health Centre, 2155 Guy Street, Montreal, QC, H3H 2R9, Canada; Ingram School of Nursing, McGill University, Wilson Hall, 3506 University Street, Montreal, QC, H3A 2A7, Canada
- Frédérick D'Aragon
  - Department of Anesthesiology, Faculty of Medicine and Health Sciences, University of Sherbrooke and Centre hospitalier universitaire de Sherbrooke, 3001, 12e Avenue Nord, Sherbrooke, QC, J1H 5N4, Canada
- Shengrui Wang
  - Faculty of Sciences, Department of Informatics, University of Sherbrooke, 2500 Boulevard de l'Université, Sherbrooke, QC, J1K 2R1, Canada
- Benoit Gallix
  - Department of Diagnostic Radiology, McGill University and McGill University Health Centre, 1650 Cedar Avenue, Montreal, QC, H3G 1A4, Canada
- Louis Valiquette
  - Department of Microbiology and Infectious Diseases, University of Sherbrooke and Centre hospitalier universitaire de Sherbrooke, 3001, 12e Avenue Nord, Sherbrooke, QC, J1H 5N4, Canada
- Li-Anne Audet
  - School of Nursing, Faculty of Medicine and Health Sciences, University of Sherbrooke, 3001, 12e Avenue Nord, Sherbrooke, QC, J1H 5N4, Canada
- Todd C Lee
  - Department of Internal Medicine, McGill University and McGill University Health Centre, 1650 Cedar Avenue, Montreal, QC, H3G 1A4, Canada
- Dev Jayaraman
  - Department of Internal Medicine, McGill University and McGill University Health Centre, 1650 Cedar Avenue, Montreal, QC, H3G 1A4, Canada
- Bruno Petrucci
  - Department of Quality, Evaluation, Performance and Ethics, Centre hospitalier universitaire de Sherbrooke, 3001, 12e Avenue Nord, Sherbrooke, QC, J1H 5N4, Canada
- Patricia Lefebvre
  - Department of Quality, Patient Safety and Performance, McGill University Health Centre, 2155 Guy Street, Montreal, QC, H3H 2R9, Canada

6
Cadieux G, Tamblyn R, Buckeridge DL, Dendukuri N. Validation of Diagnostic Groups Based on Health Care Utilization Data Should Adjust for Sampling Strategy. Med Care 2015; 55:e59-e67. [PMID: 25821898 PMCID: PMC5510703 DOI: 10.1097/mlr.0000000000000324] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
Abstract
Objective: Valid measurement of outcomes such as disease prevalence using health care utilization data is fundamental to the implementation of a “learning health system.” Definitions of such outcomes can be complex, based on multiple diagnostic codes. The literature on validating such data demonstrates a lack of awareness of the need for a stratified sampling design and corresponding statistical methods. We propose a method for validating the measurement of diagnostic groups that have: (1) different prevalences of diagnostic codes within the group; and (2) low prevalence. Methods: We describe an estimation method whereby: (1) low-prevalence diagnostic codes are oversampled, and the positive predictive value (PPV) of the diagnostic group is estimated as a weighted average of the PPV of each diagnostic code; and (2) claims that fall within a low-prevalence diagnostic group are oversampled relative to claims that are not, and bias-adjusted estimators of sensitivity and specificity are generated. Application: We illustrate our proposed method using an example from population health surveillance in which diagnostic groups are applied to physician claims to identify cases of acute respiratory illness. Conclusions: Failure to account for the prevalence of each diagnostic code within a diagnostic group leads to the underestimation of the PPV, because low-prevalence diagnostic codes are more likely to be false positives. Failure to adjust for oversampling of claims that fall within the low-prevalence diagnostic group relative to those that do not leads to the overestimation of sensitivity and underestimation of specificity.
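The first point of this abstract, that a group-level PPV should be a weighted average of per-code PPVs when rare codes are oversampled, can be illustrated with a toy calculation. The numbers below are invented, and weighting each code by its share of all claims in the group is an assumption about how the weights would be chosen. The function names are illustrative and the authors' estimators may differ in detail.

```python
from typing import Dict, Tuple

# code -> (true positives found on chart review, charts reviewed)
Sample = Dict[str, Tuple[int, int]]


def naive_ppv(sample: Sample) -> float:
    """Pooled PPV that ignores how the validation sample was drawn."""
    true_pos = sum(tp for tp, _ in sample.values())
    reviewed = sum(n for _, n in sample.values())
    return true_pos / reviewed


def weighted_ppv(sample: Sample, claim_counts: Dict[str, int]) -> float:
    """Group PPV as an average of per-code PPVs, weighted by each code's share of claims."""
    total_claims = sum(claim_counts.values())
    return sum(
        (claim_counts[code] / total_claims) * (tp / n)
        for code, (tp, n) in sample.items()
    )


# Hypothetical validation sample: equal review effort per code, so the rare
# code is heavily oversampled relative to its share of claims.
sample: Sample = {"common_code": (80, 100), "rare_code": (30, 100)}
claim_counts = {"common_code": 9000, "rare_code": 1000}

print(f"naive PPV:    {naive_ppv(sample):.2f}")
print(f"weighted PPV: {weighted_ppv(sample, claim_counts):.2f}")
```

In this toy case the pooled estimate is lower than the weighted one because the oversampled rare code has the lower PPV, which is the direction of bias the abstract describes.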
Affiliation(s)
- Geneviève Cadieux
  - *Dalla Lana School of Public Health, University of Toronto, Toronto, ON; †Department of Epidemiology, Biostatistics and Occupational Health, McGill University; ‡Direction de la Santé Publique de Montréal; §Department of Medicine, McGill University, Montreal, QC, Canada

7
Mowery DL, Jordan P, Wiebe J, Harkema H, Dowling J, Chapman WW. Semantic annotation of clinical events for generating a problem list. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2013; 2013:1032-41. [PMID: 24551392 PMCID: PMC3900128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/03/2023]
Abstract
We present a pilot study of an annotation schema representing problems and their attributes, along with their relationship to temporal modifiers. We evaluated the ability of humans to annotate clinical reports using the schema and assessed the contribution of semantic annotations in determining the status of a problem mention as active, inactive, proposed, resolved, negated, or other. Our hypothesis is that the schema captures semantic information useful for generating an accurate problem list. Clinical named entities such as reference events, time points, time durations, aspectual phase, and ordering words, and their relationships, including modifications and ordering relations, can be annotated by humans with low to moderate recall. Once identified, most attributes can be annotated with low to moderate agreement. Some attributes (Experiencer, Existence, and Certainty) are more informative than others (Intermittency and Generalized/Conditional) for predicting a problem mention's status. A support vector machine outperformed Naïve Bayes and Decision Tree classifiers for predicting a problem's status.

8
Conway M, Dowling JN, Chapman WW. Using chief complaints for syndromic surveillance: a review of chief complaint based classifiers in North America. J Biomed Inform 2013; 46:734-43. [PMID: 23602781 DOI: 10.1016/j.jbi.2013.04.003] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 02/03/2012] [Revised: 08/30/2012] [Accepted: 04/03/2013] [Indexed: 11/27/2022]
Abstract
A major goal of Natural Language Processing in the public health informatics domain is the automatic extraction and encoding of data stored in free text patient records. This extracted data can then be utilized by computerized systems to perform syndromic surveillance. In particular, the chief complaint, a short string that describes a patient's symptoms, has come to be a vital resource for syndromic surveillance in the North American context due to its near ubiquity. This paper reviews fifteen systems in North America, at the city, county, state and federal level, that use chief complaints for syndromic surveillance.
Affiliation(s)
- Mike Conway
  - Division of Biomedical Informatics, University of California, San Diego, 9500 Gilman Dr. MC 0505 La Jolla, California 92093, USA

9
Tan L, Zhang J, Cheng L, Yan W, Diwan VK, Long L, Nie S. Selecting Targeted Symptoms/Syndromes for Syndromic Surveillance in Rural China. Online J Public Health Inform 2013. [PMCID: PMC3692788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022] Open
Abstract
Objective: To select potential targeted symptoms/syndromes as early warning indicators for detecting epidemics or outbreaks in rural China. Introduction: Patients' chief complaints (CCs) are a common data source that has been widely used in syndromic surveillance because of their timeliness, accuracy and availability (1). For automated syndromic surveillance, CCs are always classified into predefined syndromic categories to facilitate subsequent data aggregation and analysis. However, in rural China, most outpatient doctors record patient information (e.g. CCs) manually in clinic logs rather than on computers. Thus, a more convenient surveillance method is needed in the syndromic surveillance project (ISSC), and the first important step is to select the targeted symptoms/syndromes. Methods: Epidemiological analysis was conducted on data from the case report system in Jingmen City (one study site in ISSC) from 2004 to 2009. Initial symptoms/syndromes were selected through literature review. Finally, expert consultation meetings, workshops and field investigations were held to confirm the targeted symptoms/syndromes. Results: Ten kinds of infectious diseases, 6 categories of emergencies, and 4 bioterrorism events (i.e. plague, anthrax, botulism and hemorrhagic fever) were chosen as specific diseases/events for monitoring (Table 1). Two surveillance schemes were developed by reviewing 565 publications on the clinical conditions of the specific diseases/events and 14 publications on CC-based syndromic surveillance. The first scheme was to monitor symptoms (19 initial symptoms) and then aggregate or analyze single or combined symptoms; the second was to monitor syndromes (9 initial syndromes) directly (Table 2).
The consultation meeting and field investigation identified three issues that should be considered: 1) the ability of doctors, especially village doctors, to understand the definitions of symptoms/syndromes; 2) the workload of data collection; 3) the sensitivity and specificity of each symptom/syndrome. Finally, Scheme 1 was used and 10 targeted symptoms were determined (Table 2). Conclusions: The simplicity, stability and feasibility of operation, as well as local conditions, should be taken into account before establishing a surveillance system. Symptoms were more suitable for monitoring than syndromes in resource-poor settings. Further evaluation and validation will be conducted during implementation. Our study may provide methods and evidence for other developing countries where conditions limit the use of automated syndromic surveillance systems, to help them construct similar early warning systems.
Affiliation(s)
- Li Tan
  - Tongji Medical College, Wuhan City, China
- Jie Zhang
  - Tongji Medical College, Wuhan City, China
- Weirong Yan
  - Tongji Medical College, Wuhan City, China; Karolinska Institutet, Stockholm, Sweden
- Lu Long
  - Tongji Medical College, Wuhan City, China
- Shaofa Nie
  - Tongji Medical College, Wuhan City, China

10
Lo YC, Chuang JH, Kuo HW, Huang WT, Hsu YF, Liu MT, Chen CH, Huang HH, Chang CH, Chou JH, Chang FY, Lin TY, Chiu WT. Surveillance and vaccine effectiveness of an influenza epidemic predominated by vaccine-mismatched influenza B/Yamagata-lineage viruses in Taiwan, 2011-12 season. PLoS One 2013; 8:e58222. [PMID: 23472161 PMCID: PMC3589334 DOI: 10.1371/journal.pone.0058222] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 09/11/2012] [Accepted: 02/01/2013] [Indexed: 01/12/2023] Open
Abstract
Introduction The 2011−12 trivalent influenza vaccine contains a strain of influenza B/Victoria-lineage viruses. Despite free provision of influenza vaccine among target populations, an epidemic predominated by influenza B/Yamagata-lineage viruses occurred during the 2011−12 season in Taiwan. We characterized this vaccine-mismatched epidemic and estimated influenza vaccine effectiveness (VE). Methods Influenza activity was monitored through sentinel viral surveillance, emergency department (ED) and outpatient influenza-like illness (ILI) syndromic surveillance, and case-based surveillance of influenza with complications and deaths. VE against laboratory-confirmed influenza was evaluated through a case-control study on ILI patients enrolled into sentinel viral surveillance. Logistic regression was used to estimate VE adjusted for confounding factors. Results During July 2011−June 2012, influenza B accounted for 2,382 (72.5%) of 3,285 influenza-positive respiratory specimens. Of 329 influenza B viral isolates with antigen characterization, 287 (87.2%) were B/Yamagata-lineage viruses. Proportions of ED and outpatient visits being ILI-related increased from November 2011 to January 2012. Of 1,704 confirmed cases of influenza with complications, including 154 (9.0%) deaths, influenza B accounted for 1,034 (60.7%) of the confirmed cases and 103 (66.9%) of the deaths. Reporting rates of confirmed influenza with complications and deaths were 73.5 and 6.6 per 1,000,000, respectively, highest among those aged ≥65 years, 50−64 years, 3−6 years, and 0−2 years. Adjusted VE was −31% (95% CI: −80, 4) against all influenza, 54% (95% CI: 3, 78) against influenza A, and −66% (95% CI: −132, −18) against influenza B. Conclusions This influenza epidemic in Taiwan was predominated by B/Yamagata-lineage viruses unprotected by the 2011−12 trivalent vaccine. 
The morbidity and mortality of this vaccine-mismatched epidemic warrants careful consideration of introducing a quadrivalent influenza vaccine that includes strains of both B lineages.
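The vaccine effectiveness figures in this abstract come from a logistic regression in a case-control design. In such designs VE is conventionally derived from the adjusted odds ratio as VE = (1 - OR) x 100%. The sketch below assumes that convention applies here, since the abstract does not state the formula, and converts the reported VE values back to the odds ratios they would imply. The function names are illustrative.

```python
def ve_percent(odds_ratio: float) -> float:
    """Vaccine effectiveness (%) from an odds ratio, using VE = (1 - OR) * 100."""
    return (1.0 - odds_ratio) * 100.0


def odds_ratio_from_ve(ve_pct: float) -> float:
    """Inverse of ve_percent: the odds ratio implied by a reported VE (%)."""
    return 1.0 - ve_pct / 100.0


# Point estimates quoted in the abstract above.
reported_ve = {"all influenza": -31, "influenza A": 54, "influenza B": -66}

for label, ve in reported_ve.items():
    print(f"{label}: VE {ve}% implies OR of about {odds_ratio_from_ve(ve):.2f}")
```

A negative VE simply corresponds to an odds ratio above 1, meaning vaccinated patients had higher odds of testing positive in this sample.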
Affiliation(s)
- Yi-Chun Lo
  - Centers for Disease Control, Taipei, Taiwan, Republic of China
- Hung-Wei Kuo
  - Centers for Disease Control, Taipei, Taiwan, Republic of China
- Wan-Ting Huang
  - Centers for Disease Control, Taipei, Taiwan, Republic of China
- Yu-Fen Hsu
  - Centers for Disease Control, Taipei, Taiwan, Republic of China
- Ming-Tsan Liu
  - Centers for Disease Control, Taipei, Taiwan, Republic of China
- Chang-Hsun Chen
  - Centers for Disease Control, Taipei, Taiwan, Republic of China
- Hui-Hsun Huang
  - Centers for Disease Control, Taipei, Taiwan, Republic of China
- Chi-Hsi Chang
  - Centers for Disease Control, Taipei, Taiwan, Republic of China
- Jih-Haw Chou
  - Centers for Disease Control, Taipei, Taiwan, Republic of China
- Feng-Yee Chang
  - Centers for Disease Control, Taipei, Taiwan, Republic of China; Department of Internal Medicine and Graduate Institute of Medical Sciences, National Defense Medical Center, Taipei, Taiwan, Republic of China
- Tzou-Yien Lin
  - Department of Health, Taipei, Taiwan, Republic of China
- Wen-Ta Chiu
  - Department of Health, Taipei, Taiwan, Republic of China

11
Mowery D, Wiebe J, Visweswaran S, Harkema H, Chapman WW. Building an automated SOAP classifier for emergency department reports. J Biomed Inform 2012; 45:71-81. [PMID: 21925286 PMCID: PMC3267853 DOI: 10.1016/j.jbi.2011.08.020] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 06/20/2011] [Revised: 08/30/2011] [Accepted: 08/31/2011] [Indexed: 10/17/2022]
Abstract
Information extraction applications that extract structured event and entity information from unstructured text can leverage knowledge of clinical report structure to improve performance. The Subjective, Objective, Assessment, Plan (SOAP) framework, used to structure progress notes to facilitate problem-specific, clinical decision making by physicians, is one example of a well-known, canonical structure in the medical domain. Although its applicability to structuring data is understood, its contribution to information extraction tasks has not yet been determined. The first step to evaluating the SOAP framework's usefulness for clinical information extraction is to apply the model to clinical narratives and develop an automated SOAP classifier that classifies sentences from clinical reports. In this quantitative study, we applied the SOAP framework to sentences from emergency department reports, and trained and evaluated SOAP classifiers built with various linguistic features. We found the SOAP framework can be applied manually to emergency department reports with high agreement (Cohen's kappa coefficients over 0.70). Using a variety of features, we found classifiers for each SOAP class can be created with moderate to outstanding performance with F(1) scores of 93.9 (subjective), 94.5 (objective), 75.7 (assessment), and 77.0 (plan). We look forward to expanding the framework and applying the SOAP classification to clinical information extraction tasks.
Affiliation(s)
- Danielle Mowery
- Department of Biomedical Informatics, University of Pittsburgh, Parkvale Building M-183, 200 Meyran Avenue, Pittsburgh, PA 15260, USA.
|
12
|
Yu AC, Cimino JJ. A comparison of two methods for retrieving ICD-9-CM data: the effect of using an ontology-based method for handling terminology changes. J Biomed Inform 2011; 44:289-98. [PMID: 21262390 PMCID: PMC3440000 DOI: 10.1016/j.jbi.2011.01.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/03/2010] [Revised: 12/13/2010] [Accepted: 01/14/2011] [Indexed: 11/05/2022]
Abstract
Objective Most existing controlled terminologies can be characterized as collections of terms, wherein the terms are arranged in a simple list or organized in a hierarchy. These kinds of terminologies are considered useful for standardizing terms and encoding data and are currently used in many existing information systems. However, they suffer from a number of limitations that make data reuse difficult. Relatively recently, it has been proposed that formal ontological methods can be applied to some of the problems of terminological design. Biomedical ontologies organize concepts (embodiments of knowledge about biomedical reality) whereas terminologies organize terms (what is used to code patient data at a certain point in time, based on the particular terminology version). However, the application of these methods to existing terminologies is not straightforward. The use of these terminologies is firmly entrenched in many systems, and what might seem to be a simple option of replacing these terminologies is not possible. Moreover, these terminologies evolve over time in order to suit the needs of users. Any methodology must therefore take these constraints into consideration, hence the need for formal methods of managing changes. Along these lines, we have developed a formal representation of the concept-term relation, around which we have also developed a methodology for management of terminology changes. The objective of this study was to determine whether our methodology would result in improved retrieval of data. Design Comparison of two methods for retrieving data encoded with terms from the International Classification of Diseases (ICD-9-CM), based on their recall when retrieving data for ICD-9-CM terms whose codes had changed but which had retained their original meaning (code change). Measurements Recall and interclass correlation coefficient. 
Results Statistically significant differences were detected (p < 0.05) with the McNemar test for two terms whose codes had changed. Furthermore, when all the cases are combined in an overall category, our method also performs statistically significantly better (p < 0.05). Conclusion Our study shows that an ontology-based ICD-9-CM data retrieval method that takes into account the effects of terminology changes performs better on recall than one that does not in the retrieval of data for terms whose codes had changed but which retained their original meaning.
Affiliation(s)
- Alexander C Yu
- Department of Biomedical Informatics, Columbia University, New York, NY, USA.
|
13
|
Cadieux G, Buckeridge DL, Jacques A, Libman M, Dendukuri N, Tamblyn R. Accuracy of syndrome definitions based on diagnoses in physician claims. BMC Public Health 2011; 11:17. [PMID: 21211054 PMCID: PMC3025839 DOI: 10.1186/1471-2458-11-17] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 09/06/2010] [Accepted: 01/07/2011] [Indexed: 11/10/2022] Open
Abstract
Background Community clinics offer potential for timelier outbreak detection and monitoring than emergency departments. However, the accuracy of syndrome definitions used in surveillance has never been evaluated in community settings. This study's objective was to assess the accuracy of syndrome definitions based on diagnostic codes in physician claims for identifying 5 syndromes (fever, gastrointestinal, neurological, rash, and respiratory including influenza-like illness) in community clinics. Methods We selected a random sample of 3,600 community-based primary care physicians who practiced in the fee-for-service system in the province of Quebec, Canada in 2005-2007. We randomly selected 10 visits per physician from their claims, stratifying on syndrome type and presence, diagnosis, and month. Double-blinded chart reviews were conducted by telephone with consenting physicians to obtain information on patient diagnoses for each sampled visit. The sensitivity, specificity, and positive predictive value (PPV) of physician claims were estimated by comparison to chart review. Results 1,098 (30.5%) physicians completed the chart review. A chart entry on the date of the corresponding claim was found for 10,529 (95.9%) visits. The sensitivity of syndrome definitions based on diagnostic codes in physician claims was low, ranging from 0.11 (fever) to 0.44 (respiratory), the specificity was high, and the PPV was moderate to high, ranging from 0.59 (fever) to 0.85 (respiratory). We found that rarely used diagnostic codes had a higher probability of being false-positives, and that more commonly used diagnostic codes had a higher PPV. Conclusions Future research should identify physician, patient, and encounter characteristics associated with the accuracy of diagnostic codes in physician claims. 
This would enable public health to improve syndromic surveillance, either by focusing on physician claims whose diagnostic code is more likely to be accurate, or by using all physician claims and weighing each according to the likelihood that its diagnostic code is accurate.
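The three accuracy measures named in this abstract can be illustrated with a short sketch. It assumes a simple unweighted 2×2 comparison of claims against chart review; the study's stratified sampling would call for weighting that is not shown here, and the function name and counts are hypothetical:

```python
from typing import NamedTuple


class Accuracy(NamedTuple):
    sensitivity: float
    specificity: float
    ppv: float


def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> Accuracy:
    """Compute sensitivity, specificity and PPV from 2x2 counts.

    tp/fn: chart-confirmed syndrome visits with/without a matching claim code.
    fp/tn: visits without the syndrome, with/without a matching claim code.
    """
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("each denominator needs at least one observation")
    return Accuracy(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        ppv=tp / (tp + fp),
    )


# Hypothetical counts, not data from the study.
result = diagnostic_accuracy(tp=44, fp=10, fn=56, tn=890)
print(result)
```

A definition can combine low sensitivity with a usable PPV, as here, when few true cases are coded but most coded visits are true cases.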
Affiliation(s)
- Geneviève Cadieux
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Canada.
|
14
|
Campbell EM, Sittig DF, Chapman WW, Hazlehurst BL, Cohen AM. Understanding Inter-rater Disagreement: A Mixed Methods Approach. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2010; 2010:81-5. [PMID: 21346945 PMCID: PMC3041314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2023]
Abstract
In an experiment to investigate cognitive skill differences between clinicians and lay persons, eight individuals in each group were asked to determine if an explicit concept existed in an ambulatory encounter note (a simple task) or if the concept could be inferred from the same note (a complex task). Subjects answered questions, highlighted text used to answer each question, and commented on their reasoning for selecting specific text. Quantitative results were mixed for expert vs. non-expert task performance on simple vs. complex tasks. Qualitative analysis revealed that data ambiguity obscured quantifiable skill differences between groups. In addition, this analysis offered new insight into whether a concept identification task is simple or complex. We present this case study to demonstrate the value of mixed method approaches to task-based performance study design and evaluation. We discuss the results in terms of their implications for evaluating meaningful use of technologies.
|
15
|
DeLisle S, South B, Anthony JA, Kalp E, Gundlapalli A, Curriero FC, Glass GE, Samore M, Perl TM. Combining free text and structured electronic medical record entries to detect acute respiratory infections. PLoS One 2010; 5:e13377. [PMID: 20976281 PMCID: PMC2954790 DOI: 10.1371/journal.pone.0013377] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 05/21/2010] [Accepted: 08/30/2010] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND The electronic medical record (EMR) contains a rich source of information that could be harnessed for epidemic surveillance. We asked if structured EMR data could be coupled with computerized processing of free-text clinical entries to enhance detection of acute respiratory infections (ARI). METHODOLOGY A manual review of EMR records related to 15,377 outpatient visits uncovered 280 reference cases of ARI. We used logistic regression with backward elimination to determine which among candidate structured EMR parameters (diagnostic codes, vital signs and orders for tests, imaging and medications) contributed to the detection of those reference cases. We also developed a computerized free-text search to identify clinical notes documenting at least two non-negated ARI symptoms. We then used heuristics to build case-detection algorithms that best combined the retained structured EMR parameters with the results of the text analysis. PRINCIPAL FINDINGS An adjusted grouping of diagnostic codes identified reference ARI patients with a sensitivity of 79%, a specificity of 96% and a positive predictive value (PPV) of 32%. Of the 21 additional structured clinical parameters considered, two contributed significantly to ARI detection: new prescriptions for cough remedies and elevations in body temperature to at least 38°C. Together with the diagnostic codes, these parameters increased detection sensitivity to 87%, but specificity and PPV declined to 95% and 25%, respectively. Adding text analysis increased sensitivity to 99%, but PPV dropped further to 14%. Algorithms that required satisfying both a query of structured EMR parameters as well as text analysis disclosed PPVs of 52-68% and retained sensitivities of 69-73%. CONCLUSION Structured EMR parameters and free-text analyses can be combined into algorithms that can detect ARI cases with new levels of sensitivity or precision. 
These results highlight potential paths by which repurposed EMR information could facilitate the discovery of epidemics before they cause mass casualties.
Affiliation(s)
- Sylvain DeLisle
- Veterans Affairs Maryland Health Care System, Baltimore, Maryland, United States of America.
|
16
|
Sarkar IN. Biomedical informatics and translational medicine. J Transl Med 2010; 8:22. [PMID: 20187952 PMCID: PMC2837642 DOI: 10.1186/1479-5876-8-22] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.2] [Received: 07/21/2009] [Accepted: 02/26/2010] [Indexed: 11/23/2022] Open
Abstract
Biomedical informatics involves a core set of methodologies that can provide a foundation for crossing the "translational barriers" associated with translational medicine. To this end, the fundamental aspects of biomedical informatics (e.g., bioinformatics, imaging informatics, clinical informatics, and public health informatics) may be essential in helping improve the ability to bring basic research findings to the bedside, evaluate the efficacy of interventions across communities, and enable the assessment of the eventual impact of translational medicine innovations on health policies. Here, a brief description is provided for a selection of key biomedical informatics topics (Decision Support, Natural Language Processing, Standards, Information Retrieval, and Electronic Health Records) and their relevance to translational medicine. Based on contributions and advancements in each of these topic areas, the article proposes that biomedical informatics practitioners ("biomedical informaticians") can be essential members of translational medicine teams.
Affiliation(s)
- Indra Neil Sarkar
- Center for Clinical and Translational Science, Department of Microbiology and Molecular Genetics, University of Vermont, College of Medicine, 89 Beaumont Ave, Given Courtyard N309, Burlington, VT 05405, USA.
|
17
|
Lu HM, Chen H, Zeng D, King CC, Shih FY, Wu TS, Hsiao JY. Multilingual chief complaint classification for syndromic surveillance: an experiment with Chinese chief complaints. Int J Med Inform 2008; 78:308-20. [PMID: 18838292 PMCID: PMC7108263 DOI: 10.1016/j.ijmedinf.2008.08.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 01/03/2008] [Revised: 08/18/2008] [Accepted: 08/19/2008] [Indexed: 11/30/2022]
Abstract
Purpose Syndromic surveillance is aimed at early detection of disease outbreaks. An important data source for syndromic surveillance is free-text chief complaints (CCs), which may be recorded in different languages. For automated syndromic surveillance, CCs must be classified into predefined syndromic categories to facilitate subsequent data aggregation and analysis. Despite the fact that syndromic surveillance is largely an international effort, existing CC classification systems do not provide adequate support for processing CCs recorded in non-English languages. This paper reports a multilingual CC classification effort, focusing on CCs recorded in Chinese. Methods We propose a novel Chinese CC classification system leveraging a Chinese-English translation module and an existing English CC classification approach. A set of 470 Chinese key phrases was extracted from about one million Chinese CC records using statistical methods. Based on the extracted key phrases, the system translates Chinese text into English and classifies the translated CCs to syndromic categories using an existing English CC classification system. Results Compared to alternative approaches using a bilingual dictionary and a general-purpose machine translation system, our approach performs significantly better in terms of positive predictive value (PPV or precision), sensitivity (recall), specificity, and F measure (the harmonic mean of PPV and sensitivity), based on a computational experiment using real-world CC records. Conclusions Our design provides satisfactory performance in classifying Chinese CCs into syndromic categories for public health surveillance. The overall design of our system also points out a potentially fruitful direction for multilingual CC systems that need to handle languages beyond English and Chinese.
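The abstract defines the F measure as the harmonic mean of PPV and sensitivity. A minimal sketch of that definition follows; the `beta` parameter is an added generalization (beta = 2 gives the F2 measure, which weights sensitivity more heavily), and the function name and example values are illustrative only:

```python
def f_measure(ppv: float, sensitivity: float, beta: float = 1.0) -> float:
    """F-beta score; with beta=1 this is the harmonic mean of PPV and sensitivity."""
    if not (0.0 <= ppv <= 1.0 and 0.0 <= sensitivity <= 1.0):
        raise ValueError("ppv and sensitivity must lie in [0, 1]")
    if beta <= 0:
        raise ValueError("beta must be positive")
    if ppv == 0.0 and sensitivity == 0.0:
        return 0.0  # convention: score is 0 when both components are 0
    b2 = beta * beta
    return (1.0 + b2) * ppv * sensitivity / (b2 * ppv + sensitivity)


# Hypothetical classifier with perfect PPV but only 50% sensitivity.
print(f_measure(1.0, 0.5))          # F1
print(f_measure(1.0, 0.5, beta=2))  # F2 penalizes the missed cases more
```

Because the harmonic mean is dominated by the smaller component, a classifier cannot reach a high F measure by maximizing PPV or sensitivity alone.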
Affiliation(s)
- Hsin-Min Lu
- Management Information Systems Department, Eller College of Management, University of Arizona, 1130 East Helen Street, McClelland Hall 430, Tucson, Arizona 85721, USA.
|
18
|
Scholer MJ, Ghneim GS, Wu S, Westlake M, Travers DA, Waller AE, McCalla AL, Wetterhall SF. Defining and applying a method for improving the sensitivity and specificity of an emergency department early event detection system. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2007; 2007:651-5. [PMID: 18693917 PMCID: PMC2655810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2007] [Revised: 07/20/2007] [Accepted: 10/11/2007] [Indexed: 05/26/2023]
Abstract
The sensitivity and specificity of syndrome definitions used in early event detection (EED) systems affect the usefulness of the system for end-users. The ability to calculate these values aids system designers in the refinement of syndrome definitions to better meet public health needs. Utilizing a stratified sampling method and expert review to create a gold standard dataset for the calculation of sensitivity and specificity, we describe how varying syndrome structure impacts these statistical parameters and discuss the relevance of this to outbreak detection and investigation.
|
19
|
Yu AC, Cimino JJ. A comparison of two methods for retrieving ICD-9-CM data: The effect of using an ontology-based method for handling terminology changes. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2007; 2007:841-845. [PMID: 18693955 PMCID: PMC2655821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2007] [Revised: 07/19/2007] [Accepted: 10/11/2007] [Indexed: 05/26/2023]
Abstract
Terminology changes may affect reusability of data, hence the need for methods for managing changes. Along these lines, we have developed a formal representation of the concept-term relationship, around which we have also developed a methodology for management of terminology changes. We have implemented our methodology in a terminology maintenance tool. To evaluate our methodology, we compared two methods for retrieving ICD-9-CM data, based on their recall when retrieving data for ICD-9-CM terms whose codes had changed but which had retained their original meaning. Our results show that recall is either the same or better with a retrieval method that takes into account the effects of terminology. Statistically significant differences were detected (p<0.05) with the McNemar test for two terms whose codes had changed. Furthermore, when all the cases are combined in an overall category, our method 2 also performs statistically significantly better than default method 1 (p < 0.05).
Affiliation(s)
- Alexander C Yu
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
|
20
|
Lu HM, Zeng D, Trujillo L, Komatsu K, Chen H. Ontology-enhanced automatic chief complaint classification for syndromic surveillance. J Biomed Inform 2007; 41:340-56. [PMID: 17928273 DOI: 10.1016/j.jbi.2007.08.009] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Received: 05/18/2007] [Revised: 07/27/2007] [Accepted: 08/11/2007] [Indexed: 11/17/2022]
Abstract
Emergency department free-text chief complaints (CCs) are a major data source for syndromic surveillance. CCs need to be classified into syndromic categories for subsequent automatic analysis. However, the lack of a standard vocabulary and high-quality encodings of CCs hinder effective classification. This paper presents a new ontology-enhanced automatic CC classification approach. Exploiting semantic relations in a medical ontology, this approach is motivated to address the CC vocabulary variation problem in general and to meet the specific need for a classification approach capable of handling multiple sets of syndromic categories. We report an experimental study comparing our approach with two popular CC classification methods using a real-world dataset. This study indicates that our ontology-enhanced approach performs significantly better than the benchmark methods in terms of sensitivity, F measure, and F2 measure.
Affiliation(s)
- Hsin-Min Lu
- Management Information Systems Department, The Eller College of Management, University of Arizona, 1130 E. Helen Street, Room 430, P.O. Box 210108, Tucson, AZ 85721-0108, USA.
|
21
|
|
22
|
Chapman WW, Dowling JN. Inductive creation of an annotation schema for manually indexing clinical conditions from emergency department reports. J Biomed Inform 2005; 39:196-208. [PMID: 16230050 PMCID: PMC1440922 DOI: 10.1016/j.jbi.2005.06.004] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 04/28/2005] [Revised: 06/20/2005] [Accepted: 06/27/2005] [Indexed: 11/18/2022]
Abstract
Evaluating automated indexing applications requires comparing automatically indexed terms against manual reference standard annotations. However, there are no standard guidelines for determining which words from a textual document to include in manual annotations, and the vague task can result in substantial variation among manual indexers. We applied grounded theory to emergency department reports to create an annotation schema representing syntactic and semantic variables that could be annotated when indexing clinical conditions. We describe the annotation schema, which includes variables representing medical concepts (e.g., symptom, demographics), linguistic form (e.g., noun, adjective), and modifier types (e.g., anatomic location, severity). We measured the schema's quality and found: (1) the schema was comprehensive enough to be applied to 20 unseen reports without changes to the schema; (2) agreement between author annotators applying the schema was high, with an F measure of 93%; and (3) the authors made complementary errors when applying the schema, demonstrating that the schema incorporates both linguistic and medical expertise.
Affiliation(s)
- Wendy W Chapman
- Center for Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA.
|