Yang D, Kim J, Yoo J, Cha WC, Paik H. Identifying the Risk of Sepsis in Patients With Cancer Using Digital Health Care Records: Machine Learning-Based Approach. JMIR Med Inform 2022;10:e37689. [PMID: 35704364 PMCID: PMC9244654 DOI: 10.2196/37689]
[Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/03/2022] [Revised: 04/18/2022] [Accepted: 05/17/2022] [Indexed: 11/13/2022]
Abstract
BACKGROUND
Sepsis is diagnosed in millions of people every year and carries a high mortality rate. Although patients with sepsis present with multimorbid conditions, including cancer, sepsis prediction has mainly focused on patients with severe injuries.
OBJECTIVE
In this paper, we present a machine learning-based approach to identify the risk of sepsis in patients with cancer using electronic health records (EHRs).
METHODS
We used deidentified longitudinal EHRs of 8580 patients with cancer from the Samsung Medical Center in Korea, recorded between 2014 and 2019. To build a prediction model based on physical status that would differ between sepsis and nonsepsis patients, we analyzed 2462 laboratory test results and 2266 medication prescriptions using graph network and statistical analyses. The medication relationships and lab test results identified by each analysis were used as additional learning features to train our predictive model.
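As a rough illustration of how medication relationships might be turned into network features, the sketch below counts how often pairs of drug classes are co-prescribed within each patient group and compares the per-patient rates. The abstract does not describe the authors' graph construction, so the function names, the toy records, and the rate-difference measure are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def co_prescription_edges(prescriptions):
    """Count, for each unordered drug pair, how many patients received both.

    `prescriptions` maps a patient ID to the list of drug classes prescribed.
    The result is a weighted edge list for a co-prescription network.
    """
    edges = Counter()
    for drugs in prescriptions.values():
        # Deduplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(drugs)), 2):
            edges[pair] += 1
    return edges


def edge_rate_difference(pair, group_a, group_b):
    """Per-patient co-prescription rate in group_a minus the rate in group_b."""
    rate_a = co_prescription_edges(group_a)[pair] / len(group_a)
    rate_b = co_prescription_edges(group_b)[pair] / len(group_b)
    return rate_a - rate_b


# Hypothetical toy records, not data from the study.
sepsis = {
    "p1": ["narcotic analgesic", "antibiotic"],
    "p2": ["narcotic analgesic", "antibiotic", "antiemetic"],
}
nonsepsis = {
    "p3": ["antiemetic", "antacid"],
    "p4": ["antibiotic", "antiemetic"],
}

sepsis_edges = co_prescription_edges(sepsis)
pair = ("antibiotic", "narcotic analgesic")
difference = edge_rate_difference(pair, sepsis, nonsepsis)
```

Edges whose rate differs strongly between the groups could then be passed to a classifier as additional features, which is one plausible reading of "medication relationships" in the text above.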
RESULTS
Patients with sepsis showed differential medication trajectories and physical status. For example, in the network-based analysis, narcotic analgesics were prescribed more often in the sepsis group, along with other drugs. Likewise, 35 types of lab tests, including albumin, globulin, and prothrombin time, showed significantly different distributions between sepsis and nonsepsis patients (P<.001). Our model outperformed the model trained using only common EHRs, improving the accuracy, area under the receiver operating characteristic curve (AUROC), and F1 score by 11.9%, 11.3%, and 13.6%, respectively. For the random forest-based model, the accuracy, AUROC, and F1 score were 0.692, 0.753, and 0.602, respectively.
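For readers less familiar with the three reported metrics, the following minimal sketch computes accuracy, F1 score, and AUROC from scratch on a handful of made-up labels and scores. It is not the authors' evaluation code; the 0.5 decision threshold and the toy values are assumptions for illustration only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auroc(y_true, scores):
    """Probability that a random positive scores above a random negative.

    Ties count as half a win; this equals the area under the ROC curve.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = sepsis) and model scores, not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = auroc(y_true, scores)
```

Accuracy and F1 depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is why the three can differ as they do in the reported results.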
CONCLUSIONS
We showed that lab tests and medication relationships can be used as efficient features for predicting sepsis in patients with cancer. Consequently, identifying the risk of sepsis in patients with cancer using EHRs and machine learning is feasible.