Dolezel D, Beauvais B, Stigler Granados P, Fulton L, Kruse CS. Effects of Internal and External Factors on Hospital Data Breaches: Quantitative Study. J Med Internet Res 2023;25:e51471. [PMID: 38127426 PMCID: PMC10767628 DOI: 10.2196/51471]
[Received: 08/01/2023] [Revised: 11/06/2023] [Accepted: 11/13/2023] [Indexed: 12/23/2023]
Abstract
BACKGROUND
Health care data breaches are the most rapidly increasing type of cybercrime; however, the predictors of health care data breaches are uncertain.
OBJECTIVE
This quantitative study aims to develop a predictive model to explain the number of hospital data breaches at the county level.
METHODS
This study evaluated data consolidated at the county level from 1032 short-term acute care hospitals. We considered the association between data breach occurrence (a dichotomous variable) and predictors based on county demographics and socioeconomics, average hospital workload, facility type, and average performance on several hospital financial metrics. We used 3 model types: logistic regression, perceptron, and support vector machine.
RESULTS
The model coefficient performance metrics indicated convergent validity across the 3 model types for all variables except bad debt and the factor level accounting for counties with >20% and up to 40% Hispanic populations, both of which had mixed coefficient directionality. The support vector machine model performed the classification task best on all metrics (accuracy, precision, recall, and F1-score). All 3 models performed the classification task well, with directional congruence of weights. In the logistic regression model, the top 5 odds ratios (indicating a higher risk of breach) were, from high to low, for inpatient workload, medical center status, pediatric trauma center status, accounts receivable, and the number of outpatient visits. The bottom 5 odds ratios (indicating the lowest odds of experiencing a data breach) occurred for counties with Black populations of >20% and <40%, >80% and <100%, and >40% but <60%, as well as counties with ≤20% Asian or between 80% and 100% Hispanic individuals. Our results are in line with those of other studies that determined that patient workload, facility type, and financial outcomes were associated with the likelihood of health care data breach occurrence.
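For readers unfamiliar with the quantities named above, the following is a minimal Python sketch of how accuracy, precision, recall, and F1-score are computed for a dichotomous outcome, and how a logistic regression coefficient converts to an odds ratio. The function names and the toy labels are hypothetical and chosen only for illustration; they are not the study's code, data, or results.

```python
import math


def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for binary labels (1 = breach)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def odds_ratio(coefficient):
    """Convert a logistic regression coefficient (log-odds) to an odds ratio."""
    return math.exp(coefficient)


# Toy labels (illustrative only, not study data): 1 = county with a breach.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

An odds ratio above 1 (a positive coefficient) corresponds to higher odds of a breach as the predictor increases, and an odds ratio below 1 corresponds to lower odds, which is how the "top 5" and "bottom 5" rankings above can be read.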
CONCLUSIONS
The results of this study provide a predictive model for health care data breaches that may guide health care managers to reduce the risk of data breaches by raising awareness of the risk factors.