1
Lighterness A, Adcock M, Scanlon LA, Price G. Data Quality-Driven Improvement in Health Care: Systematic Literature Review. J Med Internet Res 2024; 26:e57615. [PMID: 39173155 PMCID: PMC11377907 DOI: 10.2196/57615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 05/10/2024] [Accepted: 05/30/2024] [Indexed: 08/24/2024] Open
Abstract
BACKGROUND The promise of real-world evidence and the learning health care system primarily depends on access to high-quality data. Despite widespread awareness of the prevalence and potential impacts of poor data quality (DQ), best practices for its assessment and improvement are unknown. OBJECTIVE This review aims to investigate how existing research studies define, assess, and improve the quality of structured real-world health care data. METHODS A systematic literature search of studies in the English language was implemented in the Embase and PubMed databases to select studies that specifically aimed to measure and improve the quality of structured real-world data within any clinical setting. The time frame for the analysis was from January 1945 to June 2023. We standardized DQ concepts according to the Data Management Association (DAMA) DQ framework to enable comparison between studies. After screening and filtering by 2 independent authors, we identified 39 relevant articles reporting DQ improvement initiatives. RESULTS The studies were characterized by considerable heterogeneity in settings and approaches to DQ assessment and improvement. Affiliated institutions were from 18 different countries and 18 different health domains. DQ assessment methods were largely manual and targeted completeness and 1 other DQ dimension. Use of DQ frameworks was limited to the Weiskopf and Weng (3/6, 50%) or Kahn harmonized model (3/6, 50%). Use of standardized methodologies to design and implement quality improvement was lacking, but mainly included plan-do-study-act (PDSA) or define-measure-analyze-improve-control (DMAIC) cycles. Most studies reported DQ improvements using multiple interventions, which included either DQ reporting and personalized feedback (24/39, 61%), IT-related solutions (21/39, 54%), training (17/39, 44%), improvements in workflows (5/39, 13%), or data cleaning (3/39, 8%). 
Most studies reported improvements in DQ through a combination of these interventions. Statistical methods were used to determine significance of treatment effect (22/39, 56% times), but only 1 study implemented a randomized controlled study design. Variability in study designs, approaches to delivering interventions, and reporting DQ changes hindered a robust meta-analysis of treatment effects. CONCLUSIONS There is an urgent need for standardized guidelines in DQ improvement research to enable comparison and effective synthesis of lessons learned. Frameworks such as PDSA learning cycles and the DAMA DQ framework can facilitate this unmet need. In addition, DQ improvement studies can also benefit from prioritizing root cause analysis of DQ issues to ensure the most appropriate intervention is implemented, thereby ensuring long-term, sustainable improvement. Despite the rise in DQ improvement studies in the last decade, significant heterogeneity in methodologies and reporting remains a challenge. Adopting standardized frameworks for DQ assessment, analysis, and improvement can enhance the effectiveness, comparability, and generalizability of DQ improvement initiatives.
Affiliation(s)
- Anthony Lighterness
- Clinical Outcomes and Data Unit, The Christie NHS Foundation Trust, Manchester, United Kingdom
- Michael Adcock
- Clinical Outcomes and Data Unit, The Christie NHS Foundation Trust, Manchester, United Kingdom
- Lauren Abigail Scanlon
- Clinical Outcomes and Data Unit, The Christie NHS Foundation Trust, Manchester, United Kingdom
- Gareth Price
- Radiotherapy Related Research Group, University of Manchester, Manchester, United Kingdom
2
Mobeen S, Fogel J, Harishankar K, Jacobs AJ. The COVID-19 Pandemic and Routine Prenatal Care: Use of Online Visits. Matern Child Health J 2024; 28:1219-1227. [PMID: 38270717 DOI: 10.1007/s10995-024-03904-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/19/2023] [Indexed: 01/26/2024]
Abstract
OBJECTIVE To evaluate whether prenatal visits or screening/testing were fewer or occurred later during the initial phase of the COVID-19 pandemic in 2020 (CINT) as compared to the prior year (PreCINT). METHODS A retrospective cohort study compared CINT (n = 2,195) to PreCINT (n = 2,395) at seven public hospitals in New York City. The primary outcome was total number of prenatal-care visits. Secondary outcomes were components of prenatal-care visits completion, timing of standard pregnancy screening tests, and adverse neonatal outcomes. RESULTS CINT patients had more total prenatal-care visits (B = 1.30, 95% CI:1.04, 1.56, p < 0.001), lower odds for initiation of prenatal care which was inadequate according to widely used criteria (OR:0.39, 95% CI:0.34, 0.45, p < 0.001), and lower gestational age at initial visit (B=-4.51, 95% CI:-5.10, -3.93, p < 0.001) than PreCINT patients. In-person visits did not differ between the two groups. PreCINT patients had no televisits, while CINT patients had a median of one televisit (Median = 1, p < 0.001). CINT patients had increased odds for group B Streptococcus screening (OR:1.27, 95% CI: 1.10, 1.48, p = 0.001), quadrivalent screening (OR:1.30, 95% CI:1.15, 1.48, p < 0.001), and anatomy sonogram (OR:2.30, 95% CI:2.04, 2.59, p < 0.001) but decreased odds for glucose challenge test screening (OR:0.81, 95% CI:0.72, 0.91, p < 0.001). Adverse neonatal outcome did not differ between CINT and PreCINT pregnancies. CONCLUSIONS FOR PRACTICE Despite the difficulties and perceived dangers of in-person visits during the COVID-19 pandemic, the COVID-19 pandemic had little negative impact upon the outpatient prenatal care received by patients in this hospital system.
Affiliation(s)
- Sadia Mobeen
- Department of Obstetrics and Gynecology, South Brooklyn Health, 2601 Ocean Parkway, Brooklyn, New York, 11235, USA
- Joshua Fogel
- Department of Obstetrics and Gynecology, South Brooklyn Health, 2601 Ocean Parkway, Brooklyn, New York, 11235, USA
- Department of Management, Marketing, and Entrepreneurship, Brooklyn College, Brooklyn, New York, USA
- Krupa Harishankar
- Department of Obstetrics and Gynecology, Elmhurst Hospital Center, Queens, New York, USA
- Department of Obstetrics and Gynecology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Allan J Jacobs
- Department of Obstetrics and Gynecology, South Brooklyn Health, 2601 Ocean Parkway, Brooklyn, New York, 11235, USA
- Department of Obstetrics and Gynecology, Downstate Medical Center, Brooklyn, New York, USA
3
Declerck J, Kalra D, Vander Stichele R, Coorevits P. Frameworks, Dimensions, Definitions of Aspects, and Assessment Methods for the Appraisal of Quality of Health Data for Secondary Use: Comprehensive Overview of Reviews. JMIR Med Inform 2024; 12:e51560. [PMID: 38446534 PMCID: PMC10955383 DOI: 10.2196/51560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Revised: 11/07/2023] [Accepted: 01/09/2024] [Indexed: 03/07/2024] Open
Abstract
BACKGROUND Health care has not reached the full potential of the secondary use of health data because of, among other issues, concerns about the quality of the data being used. The shift toward digital health has led to an increase in the volume of health data. However, this increase in quantity has not been matched by a proportional improvement in the quality of health data. OBJECTIVE This review aims to offer a comprehensive overview of the existing frameworks for data quality dimensions and assessment methods for the secondary use of health data. In addition, it aims to consolidate the results into a unified framework. METHODS A review of reviews was conducted including reviews describing frameworks of data quality dimensions and their assessment methods, specifically from a secondary use perspective. Reviews were excluded if they were not related to the health care ecosystem, lacked relevant information related to our research objective, or were published in languages other than English. RESULTS A total of 22 reviews were included, comprising 22 frameworks, with 23 different terms for dimensions, and 62 definitions of dimensions. All dimensions were mapped toward the data quality framework of the European Institute for Innovation through Health Data. In total, 8 reviews mentioned 38 different assessment methods, pertaining to 31 definitions of the dimensions. CONCLUSIONS The findings in this review revealed a lack of consensus in the literature regarding the terminology, definitions, and assessment methods for data quality dimensions. This creates ambiguity and difficulties in developing specific assessment methods. This study goes a step further by assigning all observed definitions to a consolidated framework of 9 data quality dimensions.
Affiliation(s)
- Jens Declerck
- Department of Public Health and Primary Care, Unit of Medical Informatics and Statistics, Ghent University, Ghent, Belgium
- The European Institute for Innovation through Health Data, Ghent, Belgium
- Dipak Kalra
- Department of Public Health and Primary Care, Unit of Medical Informatics and Statistics, Ghent University, Ghent, Belgium
- The European Institute for Innovation through Health Data, Ghent, Belgium
- Robert Vander Stichele
- Faculty of Medicine and Health Sciences, Heymans Institute of Pharmacology, Ghent, Belgium
- Pascal Coorevits
- Department of Public Health and Primary Care, Unit of Medical Informatics and Statistics, Ghent University, Ghent, Belgium
4
Mountain R, Knight J, Heys K, Giorgi E, Gatheral T. Spatio-temporal modelling of referrals to outpatient respiratory clinics in the integrated care system of the Morecambe Bay area, England. BMC Health Serv Res 2024; 24:229. [PMID: 38388919 PMCID: PMC10882730 DOI: 10.1186/s12913-024-10716-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Accepted: 02/13/2024] [Indexed: 02/24/2024] Open
Abstract
BACKGROUND Promoting integrated care is a key goal of the NHS Long Term Plan to improve population respiratory health, yet there is limited data-driven evidence of its effectiveness. The Morecambe Bay Respiratory Network is an integrated care initiative operating in the North-West of England since 2017. A key target area has been reducing referrals to outpatient respiratory clinics by upskilling primary care teams. This study aims to explore space-time patterns in referrals from general practice in the Morecambe Bay area to evaluate the impact of the initiative. METHODS Data on referrals to outpatient clinics and chronic respiratory disease patient counts between 2012-2020 were obtained from the Morecambe Bay Community Data Warehouse, a large store of routinely collected healthcare data. For analysis, the data is aggregated by year and small area geography. The methodology comprises two parts. The first explores the issues that can arise when using routinely collected primary care data for space-time analysis and applies spatio-temporal conditional autoregressive modelling to adjust for data complexities. The second part models the rate of outpatient referral via a Poisson generalised linear mixed model that adjusts for changes in demographic factors and number of respiratory disease patients. RESULTS The first year of the Morecambe Bay Respiratory Network was not associated with a significant difference in referral rate. However, the second and third years saw significant reductions in areas that had received intervention, with full intervention associated with a 31.8% (95% CI 17.0-43.9) and 40.5% (95% CI 27.5-50.9) decrease in referral rate in 2018 and 2019, respectively. CONCLUSIONS Routinely collected data can be used to robustly evaluate key outcome measures of integrated care. The results demonstrate that effective integrated care has real potential to ease the burden on respiratory outpatient services by reducing the need for an onward referral. 
This is of great relevance given the current pressure on outpatient services globally, particularly long waiting lists following the COVID-19 pandemic and the need for more innovative models of care.
Affiliation(s)
- Jo Knight
- Lancaster Medical School, Lancaster University, Lancaster, UK
- Kelly Heys
- University Hospitals of Morecambe Bay NHS Foundation Trust, Westmorland General Hospital, Kendal, UK
- Emanuele Giorgi
- Lancaster Medical School, Lancaster University, Lancaster, UK
- Timothy Gatheral
- Lancaster Medical School, Lancaster University, Lancaster, UK
- University Hospitals of Morecambe Bay NHS Foundation Trust, Westmorland General Hospital, Kendal, UK
5
Leviton A, Loddenkemper T. Design, implementation, and inferential issues associated with clinical trials that rely on data in electronic medical records: a narrative review. BMC Med Res Methodol 2023; 23:271. [PMID: 37974111 PMCID: PMC10652539 DOI: 10.1186/s12874-023-02102-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Accepted: 11/08/2023] [Indexed: 11/19/2023] Open
Abstract
Real world evidence is now accepted by authorities charged with assessing the benefits and harms of new therapies. Clinical trials based on real world evidence are much less expensive than randomized clinical trials that do not rely on "real world evidence" such as that contained in electronic health records (EHR). Consequently, we can expect an increase in the number of reports of these types of trials, which we identify here as 'EHR-sourced trials.' In this selected literature review, we discuss the various designs and the ethical issues they raise. EHR-sourced trials have the potential to improve/increase common data elements and other aspects of the EHR and related systems. Caution is advised, however, in drawing causal inferences about the relationships among EHR variables. Nevertheless, we anticipate that EHR-sourced trials will play a central role in answering research and regulatory questions.
Affiliation(s)
- Alan Leviton
- Department of Neurology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Tobias Loddenkemper
- Department of Neurology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
6
Syed R, Eden R, Makasi T, Chukwudi I, Mamudu A, Kamalpour M, Kapugama Geeganage D, Sadeghianasl S, Leemans SJJ, Goel K, Andrews R, Wynn MT, Ter Hofstede A, Myers T. Digital Health Data Quality Issues: Systematic Review. J Med Internet Res 2023; 25:e42615. [PMID: 37000497 PMCID: PMC10131725 DOI: 10.2196/42615] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 09/12/2022] [Revised: 12/07/2022] [Accepted: 12/31/2022] [Indexed: 04/01/2023] Open
Abstract
BACKGROUND The promise of digital health is principally dependent on the ability to electronically capture data that can be analyzed to improve decision-making. However, the ability to effectively harness data has proven elusive, largely because of the quality of the data captured. Despite the importance of data quality (DQ), an agreed-upon DQ taxonomy evades literature. When consolidated frameworks are developed, the dimensions are often fragmented, without consideration of the interrelationships among the dimensions or their resultant impact. OBJECTIVE The aim of this study was to develop a consolidated digital health DQ dimension and outcome (DQ-DO) framework to provide insights into 3 research questions: What are the dimensions of digital health DQ? How are the dimensions of digital health DQ related? and What are the impacts of digital health DQ? METHODS Following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, a developmental systematic literature review was conducted of peer-reviewed literature focusing on digital health DQ in predominately hospital settings. A total of 227 relevant articles were retrieved and inductively analyzed to identify digital health DQ dimensions and outcomes. The inductive analysis was performed through open coding, constant comparison, and card sorting with subject matter experts to identify digital health DQ dimensions and digital health DQ outcomes. Subsequently, a computer-assisted analysis was performed and verified by DQ experts to identify the interrelationships among the DQ dimensions and relationships between DQ dimensions and outcomes. The analysis resulted in the development of the DQ-DO framework. 
RESULTS The digital health DQ-DO framework consists of 6 dimensions of DQ, namely accessibility, accuracy, completeness, consistency, contextual validity, and currency; interrelationships among the dimensions of digital health DQ, with consistency being the most influential dimension impacting all other digital health DQ dimensions; 5 digital health DQ outcomes, namely clinical, clinician, research-related, business process, and organizational outcomes; and relationships between the digital health DQ dimensions and DQ outcomes, with the consistency and accessibility dimensions impacting all DQ outcomes. CONCLUSIONS The DQ-DO framework developed in this study demonstrates the complexity of digital health DQ and the necessity for reducing digital health DQ issues. The framework further provides health care executives with holistic insights into DQ issues and resultant outcomes, which can help them prioritize which DQ-related problems to tackle first.
Affiliation(s)
- Rehan Syed
- School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Rebekah Eden
- School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Tendai Makasi
- School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Ignatius Chukwudi
- School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Azumah Mamudu
- School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Mostafa Kamalpour
- School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Dakshi Kapugama Geeganage
- School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Sareh Sadeghianasl
- School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Sander J J Leemans
- Rheinisch-Westfälische Technische Hochschule, Aachen University, Aachen, Germany
- Kanika Goel
- School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Robert Andrews
- School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Moe Thandar Wynn
- School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Arthur Ter Hofstede
- School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
- Trina Myers
- School of Information Systems, Faculty of Science, Queensland University of Technology, Brisbane, Australia
7
Bots SH, Onland-Moret NC, den Ruijter HM. Addressing persistent evidence gaps in cardiovascular sex differences research - the potential of clinical care data. Front Glob Womens Health 2023; 3:1006425. [PMID: 36741297 PMCID: PMC9895823 DOI: 10.3389/fgwh.2022.1006425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2022] [Accepted: 12/21/2022] [Indexed: 01/21/2023] Open
Abstract
Women have historically been underrepresented in cardiovascular clinical trials, resulting in a lack of sex-specific data. This is especially problematic in two situations, namely those where diseases manifest differently in women and men and those where biological differences between the sexes might affect the efficacy and/or safety of medication. There is therefore a pressing need for datasets with proper representation of women to address questions related to these situations. Clinical care data could fit this bill nicely because of their unique broad scope across both patient groups and clinical measures. This perspective piece presents the potential of clinical care data in sex differences research and discusses current challenges clinical care data-based research faces. It also suggests strategies to reduce the effect of these limitations, and explores whether clinical care data alone will be sufficient to close evidence gaps or whether a more comprehensive approach is needed.
Affiliation(s)
- Sophie H. Bots
- Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Utrecht, Netherlands
- Laboratory for Experimental Cardiology, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Correspondence: Sophie H. Bots
- N. Charlotte Onland-Moret
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Hester M. den Ruijter
- Laboratory for Experimental Cardiology, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
8
Keegan NM, Vasselman SE, Barnett ES, Nweji B, Carbone EA, Blum A, Morris MJ, Rathkopf DE, Slovin SF, Danila DC, Autio KA, Scher HI, Kantoff PW, Abida W, Stopsack KH. Clinical annotations for prostate cancer research: Defining data elements, creating a reproducible analytical pipeline, and assessing data quality. Prostate 2022; 82:1107-1116. [PMID: 35538298 PMCID: PMC9246896 DOI: 10.1002/pros.24363] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/18/2022] [Revised: 04/04/2022] [Accepted: 04/20/2022] [Indexed: 12/13/2022]
Abstract
BACKGROUND Routine clinical data from clinical charts are indispensable for retrospective and prospective observational studies and clinical trials. Their reproducibility is often not assessed. We developed a prostate cancer-specific database for clinical annotations and evaluated data reproducibility. METHODS For men with prostate cancer who had clinical-grade paired tumor-normal sequencing at a comprehensive cancer center, we performed team-based retrospective data collection from the electronic medical record using a defined source hierarchy. We developed an open-source R package for data processing. With blinded repeat annotation by a reference medical oncologist, we assessed data completeness, reproducibility of team-based annotations, and impact of measurement error on bias in survival analyses. RESULTS Data elements on demographics, diagnosis and staging, disease state at the time of procuring a genomically characterized sample, and clinical outcomes were piloted and then abstracted for 2261 patients (with 2631 samples). Completeness of data elements was generally high. Comparing to the repeat annotation by a medical oncologist blinded to the database (100 patients/samples), reproducibility of annotations was high; T stage, metastasis date, and presence and date of castration resistance had lower reproducibility. Impact of measurement error on estimates for strong prognostic factors was modest. CONCLUSIONS With a prostate cancer-specific data dictionary and quality control measures, manual clinical annotations by a multidisciplinary team can be scalable and reproducible. The data dictionary and the R package for reproducible data processing are freely available to increase data quality and efficiency in clinical prostate cancer research.
Affiliation(s)
- Niamh M. Keegan
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
- Ethan S. Barnett
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
- Barbara Nweji
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
- Emily A. Carbone
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
- Alexander Blum
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
- Michael J. Morris
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
- Weill Cornell Medical College, New York, NY
- Dana E. Rathkopf
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
- Weill Cornell Medical College, New York, NY
- Susan F Slovin
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
- Weill Cornell Medical College, New York, NY
- Daniel C. Danila
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
- Weill Cornell Medical College, New York, NY
- Karen A. Autio
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
- Weill Cornell Medical College, New York, NY
- Howard I. Scher
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
- Weill Cornell Medical College, New York, NY
- Philip W. Kantoff
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
- Weill Cornell Medical College, New York, NY
- Wassim Abida
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
- Weill Cornell Medical College, New York, NY
- Correspondence: Wassim Abida and Konrad Stopsack, Department of Medicine, Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065; Phone (646) 422-4633, and
- Konrad H. Stopsack
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
9
Kaspar M, Fette G, Hanke M, Ertl M, Puppe F, Störk S. Automated provision of clinical routine data for a complex clinical follow-up study: A data warehouse solution. Health Informatics J 2022; 28:14604582211058081. [PMID: 34986681 DOI: 10.1177/14604582211058081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
A deep integration of routine care and research remains challenging in many respects. We aimed to show the feasibility of an automated transformation and transfer process feeding deeply structured data with a high level of granularity collected for a clinical prospective cohort study from our hospital information system to the study's electronic data capture system, while accounting for study-specific data and visits. We developed a system integrating all necessary software and organizational processes then used in the study. The process and key system components are described together with descriptive statistics to show its feasibility in general and to identify individual challenges in particular. Data of 2051 patients enrolled between 2014 and 2020 was transferred. We were able to automate the transfer of approximately 11 million individual data values, representing 95% of all entered study data. These were recorded in n = 314 variables (28% of all variables), with some variables being used multiple times for follow-up visits. Our validation approach allowed for constant good data quality over the course of the study. In conclusion, the automated transfer of multi-dimensional routine medical data from HIS to study databases using specific study data and visit structures is complex, yet viable.
Affiliation(s)
- Mathias Kaspar
- Comprehensive Heart Failure Center and Department of Internal Medicine I, University and University Hospital Würzburg, Würzburg, Germany
- Department of Health Services Research, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Georg Fette
- Service Center Medical Informatics, Würzburg University Hospital, Würzburg, Germany
- Monika Hanke
- Comprehensive Heart Failure Center and Department of Internal Medicine I, University and University Hospital Würzburg, Würzburg, Germany
- Maximilian Ertl
- Service Center Medical Informatics, Würzburg University Hospital, Würzburg, Germany
- Frank Puppe
- Chair of Computer Science VI, University of Würzburg, Würzburg, Germany
- Stefan Störk
- Comprehensive Heart Failure Center and Department of Internal Medicine I, University and University Hospital Würzburg, Würzburg, Germany
10
Validation of algorithms for identifying outpatient infections in MS patients using electronic medical records. Mult Scler Relat Disord 2021; 57:103449. [PMID: 34915315 DOI: 10.1016/j.msard.2021.103449] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/17/2021] [Revised: 11/19/2021] [Accepted: 12/02/2021] [Indexed: 11/20/2022]
Abstract
Background Our multiple sclerosis (MS) stakeholder groups expressed concerns about whether MS disease-modifying therapies (DMTs) increase the risk of specific outpatient infections. Validated methods for identifying the risk of these selected outpatient infections in the general population either do not exist, exclude the clinically important possibility of recurrent infections, or are inaccurate, largely because existing studies relied primarily on International Classification of Diseases (ICD) codes to identify infectious outcomes. Additionally, no studies have validated methods among the MS population, where some MS symptoms can be mistaken for infections (e.g., urinary tract infections (UTIs)). Objective To utilize multiple data elements in the electronic health record (EHR) to improve accurate identification of selected outpatient infections in an MS cohort and general population controls. Methods We searched Kaiser Permanente Southern California's EHR based on ICD-9/10 codes for specified outpatient infections from 1/1/2008-12/31/2018 among our MS cohort (n=6000) and 5:1 general population controls matched on age, sex, and race/ethnicity (n=30,010). Random sample chart abstractions from each group were used to identify common coding errors for outpatient pneumonia, upper and lower respiratory tract infection, UTIs, herpetic infections (herpes zoster (HZ), herpes simplex virus (HSV)), fungal infections, otitis media, cellulitis, and influenza. This information was used to define discrete infectious episodes and to identify the algorithm with the highest positive predictive value (PPV) after supplementing the ICD-coded episodes with radiology, laboratory and/or pharmacy data. Results PPVs relying on ICD codes alone were inaccurate, particularly for identifying recurrent herpetic infections (HZ (42%) and HSV (60%)), UTIs (42%) and outpatient pneumonia (20%) in MS patients. Defining and validating episodes improved the PPVs for all the selected infections. 
The final algorithms' PPVs were 80-100% in MS and 75-100% in the general population, after including dispensed treatments (UTI, herpetic infections and yeast vaginitis), timing of dispensed treatments (UTI, herpetic infections and yeast vaginitis), removal of prophylactic antiviral use (herpetic infections), and inclusion of selected laboratory (UTIs) and imaging results (pneumonia). The only exception was outpatient pneumonia, where PPVs improved but remained ≤70%. There were no significant differences in the PPVs for the final algorithms between the MS and general populations. Conclusions Provided herein are accurate and validated algorithms that can be used to improve our understanding of how the risk of recurrent outpatient infections is influenced by MS treatments, MS-related disability, and co-morbidities. Findings from such studies will be important in helping patients and clinicians engage in shared decision-making and in developing strategies to mitigate risks of recurrent infections.
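The two ideas in this abstract, collapsing ICD-coded encounters into discrete episodes and scoring an algorithm by its PPV against chart review, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the 30-day gap, the function names, and the sample dates and chart-review outcomes are all assumptions made for illustration.

```python
from datetime import date, timedelta

def group_into_episodes(encounter_dates, gap_days=30):
    """Merge coded encounter dates into episodes.

    A new episode starts when more than gap_days (an assumed window)
    have passed since the previous coded encounter.
    """
    episodes = []
    for d in sorted(encounter_dates):
        if episodes and (d - episodes[-1][-1]) <= timedelta(days=gap_days):
            episodes[-1].append(d)
        else:
            episodes.append([d])
    return episodes

def positive_predictive_value(confirmed_flags):
    """PPV = chart-confirmed episodes / all algorithm-flagged episodes."""
    flags = list(confirmed_flags)
    if not flags:
        raise ValueError("no flagged episodes to evaluate")
    return sum(flags) / len(flags)

# Hypothetical coded encounters for one patient and one infection type.
dates = [date(2015, 1, 5), date(2015, 1, 12), date(2015, 6, 1)]
episodes = group_into_episodes(dates)

# Hypothetical chart-review outcomes for five flagged episodes.
ppv = positive_predictive_value([True, False, True, True, True])
print(len(episodes), ppv)
```

Here the two January encounters fall within the assumed window and count as one episode, while the June encounter starts a second, which is the kind of distinction needed to count recurrent infections.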
|
11
|
Wang H, Belitskaya-Levy I, Wu F, Lee JS, Shih MC, Tsao PS, Lu Y. A statistical quality assessment method for longitudinal observations in electronic health record data with an application to the VA million veteran program. BMC Med Inform Decis Mak 2021; 21:289. [PMID: 34670548 PMCID: PMC8529838 DOI: 10.1186/s12911-021-01643-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/21/2021] [Accepted: 09/21/2021] [Indexed: 11/10/2022]
Abstract
Background To describe an automated method for assessment of the plausibility of continuous variables collected in the electronic health record (EHR) data for real-world evidence research use. Methods The most widely used approach in quality assessment (QA) for continuous variables is to detect implausible numbers using prespecified thresholds. To augment the thresholding method, we developed a score-based method that leverages the longitudinal characteristics of EHR data to detect observations inconsistent with the history of a patient. The method was applied to the height and weight data in the EHR from the Million Veteran Program Data from the Veteran’s Healthcare Administration (VHA). A validation study was also conducted. Results The receiver operating characteristic (ROC) metrics of the developed method outperform those of the widely used thresholding method. It is also demonstrated that different quality assessment methods have a non-ignorable impact on the body mass index (BMI) classification calculated from height and weight data in the VHA’s database. Conclusions The score-based method enables automated and scaled detection of the problematic data points in health care big data while allowing the investigators to select the high-quality data based on their need. Leveraging the longitudinal characteristics in EHR will significantly improve the QA performance. Supplementary Information The online version contains supplementary material available at 10.1186/s12911-021-01643-2.
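The contrast between fixed thresholds and a history-aware score can be sketched in Python as follows. The abstract does not specify the authors' score, so the median/MAD deviation used here is a stand-in technique, and the threshold range, cutoff, and weight series are assumptions for illustration only.

```python
from statistics import median

def threshold_flags(values, low, high):
    """Flag values outside a prespecified population-level range."""
    return [not (low <= v <= high) for v in values]

def longitudinal_scores(values):
    """Score each observation by its distance from the patient's own median,
    scaled by the median absolute deviation (MAD). A stand-in score, not the
    published method."""
    med = median(values)
    deviations = [abs(v - med) for v in values]
    mad = median(deviations) or 1.0  # avoid division by zero for flat series
    return [dev / mad for dev in deviations]

# Hypothetical weight history (kg) for one patient; 38.0 is a likely entry error.
weights_kg = [82.0, 83.5, 81.0, 38.0, 84.0, 82.5]

by_threshold = threshold_flags(weights_kg, low=30.0, high=300.0)
scores = longitudinal_scores(weights_kg)
by_score = [i for i, s in enumerate(scores) if s > 5.0]  # assumed cutoff

print(by_threshold)
print(by_score)
```

In this example the value 38.0 passes the population-level range check yet stands out sharply against the patient's own history, which is the situation a longitudinal score is meant to catch.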
Affiliation(s)
- Hui Wang
- Department of Veterans Affairs, Cooperative Studies Program Palo Alto Coordinating Center, 701B North Shoreline Blvd, Mountain View, CA, 94043, USA
- Ilana Belitskaya-Levy
- Department of Veterans Affairs, Cooperative Studies Program Palo Alto Coordinating Center, 701B North Shoreline Blvd, Mountain View, CA, 94043, USA
- Fan Wu
- Department of Veterans Affairs, Cooperative Studies Program Palo Alto Coordinating Center, 701B North Shoreline Blvd, Mountain View, CA, 94043, USA
- Jennifer S Lee
- Department of Veterans Affairs, Cooperative Studies Program Palo Alto Coordinating Center, 701B North Shoreline Blvd, Mountain View, CA, 94043, USA; Department of Medicine, Stanford University School of Medicine, 1265 Welch Road, Stanford, CA, 94305-5464, USA; Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Mei-Chiung Shih
- Department of Veterans Affairs, Cooperative Studies Program Palo Alto Coordinating Center, 701B North Shoreline Blvd, Mountain View, CA, 94043, USA; Department of Biomedical Data Science, Stanford University School of Medicine, 1265 Welch Road, X359, Stanford, CA, 94305-5464, USA
- Philip S Tsao
- Department of Veterans Affairs, Cooperative Studies Program Palo Alto Coordinating Center, 701B North Shoreline Blvd, Mountain View, CA, 94043, USA; Department of Medicine, Stanford University School of Medicine, 1265 Welch Road, Stanford, CA, 94305-5464, USA
- Ying Lu
- Department of Veterans Affairs, Cooperative Studies Program Palo Alto Coordinating Center, 701B North Shoreline Blvd, Mountain View, CA, 94043, USA; Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, CA, 94305, USA; Department of Biomedical Data Science, Stanford University School of Medicine, 1265 Welch Road, X359, Stanford, CA, 94305-5464, USA
|
12
|
Blacketer C, Defalco FJ, Ryan PB, Rijnbeek PR. Increasing trust in real-world evidence through evaluation of observational data quality. J Am Med Inform Assoc 2021; 28:2251-2257. [PMID: 34313749 PMCID: PMC8449628 DOI: 10.1093/jamia/ocab132] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 03/25/2021] [Revised: 05/25/2021] [Accepted: 06/15/2021] [Indexed: 11/13/2022]
Abstract
Objective Advances in standardization of observational healthcare data have enabled methodological breakthroughs, rapid global collaboration, and generation of real-world evidence to improve patient outcomes. Standardizations in data structure, such as use of common data models, need to be coupled with standardized approaches for data quality assessment. To ensure confidence in real-world evidence generated from the analysis of real-world data, one must first have confidence in the data itself. Materials and Methods We describe the implementation of check types across a data quality framework of conformance, completeness, and plausibility, with both verification and validation. We illustrate how data quality checks, paired with decision thresholds, can be configured to customize data quality reporting across a range of observational health data sources. We discuss how data quality reporting can become part of the overall real-world evidence generation and dissemination process to promote transparency and build confidence in the resulting output. Results The Data Quality Dashboard is an open-source R package that reports potential quality issues in an OMOP CDM instance through the systematic execution and summarization of over 3300 configurable data quality checks. Discussion Transparently communicating how well common data model-standardized databases adhere to a set of quality measures adds a crucial piece that is currently missing from observational research. Conclusion Assessing and improving the quality of our data will inherently improve the quality of the evidence we generate.
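The pairing of a data quality check with a decision threshold can be sketched generically in Python. This is not code from the Data Quality Dashboard (which is an R package) and does not reproduce its API; the function name, the 5% threshold, and the sample records are hypothetical.

```python
def completeness_check(rows, field, threshold_pct):
    """Report the percentage of records missing `field` and whether that
    percentage exceeds the configured decision threshold."""
    total = len(rows)
    missing = sum(1 for row in rows if row.get(field) is None)
    pct_missing = 100.0 * missing / total if total else 0.0
    return {
        "field": field,
        "pct_missing": pct_missing,
        "failed": pct_missing > threshold_pct,
    }

# Hypothetical person records; one is missing year_of_birth.
persons = [
    {"person_id": 1, "year_of_birth": 1980},
    {"person_id": 2, "year_of_birth": None},
    {"person_id": 3, "year_of_birth": 1975},
    {"person_id": 4, "year_of_birth": 1990},
]

result = completeness_check(persons, "year_of_birth", threshold_pct=5.0)
print(result)
```

Keeping the threshold as a parameter rather than a constant mirrors the abstract's point that the same check can be configured differently for different data sources.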
Affiliation(s)
- Clair Blacketer
- Observational Health Data Analytics, Janssen Research and Development, LLC, Titusville, New Jersey, USA; Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Frank J Defalco
- Observational Health Data Analytics, Janssen Research and Development, LLC, Titusville, New Jersey, USA
- Patrick B Ryan
- Observational Health Data Analytics, Janssen Research and Development, LLC, Titusville, New Jersey, USA; Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Peter R Rijnbeek
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
|
13
|
Biggin F, Howcroft T, Davies Q, Knight J, Emsley HCA. Variation in waiting times by diagnostic category: an observational study of 1,951 referrals to a neurology outpatient clinic. BMJ Neurol Open 2021; 3:e000133. [PMID: 34151270 PMCID: PMC8183200 DOI: 10.1136/bmjno-2021-000133] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/27/2021] [Revised: 04/01/2021] [Accepted: 04/06/2021] [Indexed: 11/08/2022]
Abstract
OBJECTIVE To investigate the frequency of diagnoses seen among new referrals to neurology outpatient services; to understand how these services are used through exploratory analysis of diagnostic tests and follow-up appointments; and to examine the waiting times between referral and appointment. METHODS Routine data from new National Health Service appointments at a single consultant-delivered clinic between September 2016 and January 2019 were collected. These clinical data were then linked to hospital administrative data. The combined data were assigned diagnostic categories based on working diagnoses to allow further analysis using descriptive statistics. RESULTS Five diagnostic categories accounted for 62% of all patients seen within the study period, the most common of which was headache disorders. Following a first appointment, 50% of all patients were offered at least one diagnostic test, and 35% were offered a follow-up appointment, with variation in both measures by diagnostic category. Waiting times from referral to appointment also varied by diagnostic category. 65% of patients with a seizure/epilepsy disorder were seen within the 18-week referral to treatment target, compared with 38% of patients with a movement disorder. CONCLUSIONS A small number of diagnostic categories account for a large proportion of new patients. This information could be used in policy decision-making to describe a minimum subset of categories for diagnostic coding. We found significant differences in waiting times by diagnostic category, as well as tests ordered, and follow-up offered; further investigation could address causes of variation.
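The waiting-time comparison reported above amounts to computing, per diagnostic category, the share of referrals seen within the 18-week target. A minimal Python sketch follows; the referral records are invented for illustration and do not reproduce the study's data, and treating a wait of exactly 18 weeks as within target is an assumption.

```python
from collections import defaultdict

TARGET_WEEKS = 18  # referral-to-treatment target named in the abstract

def share_within_target(referrals, target_weeks=TARGET_WEEKS):
    """Return, per category, the fraction of referrals seen within the target.

    `referrals` is an iterable of (category, wait_in_weeks) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # category -> [within, total]
    for category, wait_weeks in referrals:
        counts[category][1] += 1
        if wait_weeks <= target_weeks:
            counts[category][0] += 1
    return {cat: within / total for cat, (within, total) in counts.items()}

# Hypothetical referrals: (diagnostic category, weeks from referral to appointment).
referrals = [
    ("seizure/epilepsy", 10),
    ("seizure/epilepsy", 17),
    ("seizure/epilepsy", 25),
    ("seizure/epilepsy", 12),
    ("movement disorder", 30),
    ("movement disorder", 16),
]

shares = share_within_target(referrals)
print(shares)
```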
Affiliation(s)
- Fran Biggin
- Lancaster Medical School, Lancaster University Faculty of Health and Medicine, Lancaster, UK
- Timothy Howcroft
- Health Informatics, Lancashire Teaching Hospitals NHS Foundation Trust, Preston, UK
- Quinta Davies
- Health Informatics, Lancashire Teaching Hospitals NHS Foundation Trust, Preston, UK
- Jo Knight
- Lancaster Medical School, Lancaster University Faculty of Health and Medicine, Lancaster, UK
- Hedley C A Emsley
- Lancaster Medical School, Lancaster University Faculty of Health and Medicine, Lancaster, UK; Department of Neurology, Lancashire Teaching Hospitals NHS Foundation Trust, Preston, UK
|
14
|
Abstract
OBJECTIVES Clinical Research Informatics (CRI) declares its scope in its name, but its content, both in terms of the clinical research it supports (and sometimes initiates) and the methods it has developed over time, reaches much further than the name suggests. The goal of this review is to celebrate the extraordinary diversity of activity and of results, not as a prize-giving pageant, but in recognition of the field, the community that both serves and is sustained by it, and of its interdisciplinarity and its international dimension. METHODS Beyond personal awareness of a range of work commensurate with the author's own research, it is clear that, even with a thorough literature search, a comprehensive review is impossible. Moreover, the field has grown and subdivided to an extent that makes it very hard for one individual to be familiar with every branch or with more than a few branches in any depth. A literature survey was conducted that focused on informatics-related terms in the general biomedical and healthcare literature, and specific concerns ("artificial intelligence", "data models", "analytics", etc.) in the biomedical informatics (BMI) literature. In addition to a selection from the results from these searches, suggestive references within them were also considered. RESULTS The substantive sections of the paper (Artificial Intelligence, Machine Learning, and "Big Data" Analytics; Common Data Models, Data Quality, and Standards; Phenotyping and Cohort Discovery; Privacy: Deidentification, Distributed Computation, Blockchain; Causal Inference and Real-World Evidence) provide broad coverage of these active research areas, with, no doubt, a bias towards this reviewer's interests and preferences, landing on a number of papers that stood out in one way or another, or, alternatively, exemplified a particular line of work. CONCLUSIONS CRI is thriving, not only in the familiar major centers of research, but more widely, throughout the world. 
This is not to pretend that the distribution is uniform, but to highlight the potential for this domain to play a prominent role in supporting progress in medicine, healthcare, and wellbeing everywhere. We conclude with the observation that CRI and its practitioners would make apt stewards of the new medical knowledge that their methods will bring forward.
Affiliation(s)
- Anthony Solomonides
- Outcomes Research Network, Research Institute, NorthShore University HealthSystem, Evanston, IL, USA
|