Zhang X, Zhang X, Zhang D, Xu J, Zhang J, Zhang X. The clinical prediction model to distinguish between colonization and infection by Klebsiella pneumoniae. Front Microbiol 2025; 15:1508030. [PMID: 39917270 PMCID: PMC11800808 DOI: 10.3389/fmicb.2024.1508030]
[Received: 10/08/2024] [Accepted: 12/30/2024] [Indexed: 02/09/2025]
Abstract
Objective
To develop a machine learning-based prediction model to assist clinicians in accurately determining whether the detection of Klebsiella pneumoniae (KP) in sputum samples indicates an infection, facilitating timely diagnosis and treatment.
Research methods
A retrospective analysis was conducted on 8,318 patients with KP cultures admitted to a tertiary hospital in Northeast China from January 2019 to December 2023. After excluding duplicates, other specimen types, cases with substandard specimen quality, and mixed infections, 286 cases with sputum cultures yielding only KP were included, comprising 67 cases in the colonization group and 219 cases in the infection group. Antimicrobial susceptibility testing was performed on the included strains, and through univariate logistic regression analysis, 15 key influencing factors were identified, including: age > 62 years, ESBL, CRKP, number of positive sputum cultures for KP, history of tracheostomy, use of mechanical ventilation for >96 h, indwelling gastric tube, history of craniotomy, recent local glucocorticoid application, altered consciousness, bedridden state, diagnosed with respiratory infectious disease upon admission, electrolyte disorder, hypoalbuminemia, and admission to ICU (all p < 0.05). These factors were used to construct the model, which was evaluated using accuracy, precision, recall, F1 score, AUC value, and Brier score.
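The evaluation metrics named above can be illustrated with a minimal sketch. It is not the authors' code: it assumes that infection is coded as the positive class (1) and colonization as 0, that a probability threshold of 0.5 is used, and the function name `classification_metrics` and the toy data are hypothetical, chosen only to show how each metric is defined.

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Compute common binary-classification metrics.

    y_true: list of 0/1 labels (assumption: 1 = infection, 0 = colonization).
    y_prob: list of predicted probabilities for class 1.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Brier score: mean squared difference between probability and outcome.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

    # AUC as the share of (positive, negative) pairs ranked correctly,
    # counting ties as half a correct pair.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    auc = wins / (len(pos) * len(neg)) if pos and neg else float("nan")

    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "brier": brier, "auc": auc}


# Toy, made-up labels and probabilities (not study data).
y_true = [1, 1, 1, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.2, 0.6, 0.7]
metrics = classification_metrics(y_true, y_prob)
```

A lower Brier score indicates better-calibrated probabilities, whereas higher values are better for the other five metrics.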
Results
Antimicrobial susceptibility testing indicated that resistance rates for penicillins, cephalosporins, carbapenems, and quinolones were significantly higher in the infection group than in the colonization group (all p < 0.05). Six predictive models were constructed from the 15 key influencing factors: Classification and Regression Trees (CART), C5.0, Gradient Boosting Machines (GBM), Support Vector Machines (SVM), Random Forest (RF), and Nomogram. The Random Forest model performed best across the evaluation metrics (accuracy 0.93, precision 0.98, recall 0.72, F1 score 0.83, AUC 0.99, Brier score 0.06). Factor importance was measured by mean decrease in Gini. "Admitted with a diagnosis of respiratory infectious disease" (8.39) was the most important factor in the model, followed by "Hypoalbuminemia" (7.83), "ESBL" (7.06), "Electrolyte Imbalance" (5.81), "Age > 62 years" (5.24), "The number of Positive Sputum Cultures for KP > 2" (4.77), and being bedridden (4.24). Invasive procedures (such as history of tracheostomy, use of ventilators for >96 h, and craniotomy) were also significant predictive factors. The Nomogram indicated that CRKP, presence of a nasogastric tube, admission to the ICU, and history of tracheostomy were important factors in determining KP colonization.
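As a quick arithmetic check, the F1 score is the harmonic mean of precision and recall, so the reported Random Forest F1 can be recomputed from the reported precision and recall. The sketch below uses only the rounded values quoted in this abstract, so the result is approximate.

```python
# Rounded values as reported in the abstract for the Random Forest model.
precision = 0.98
recall = 0.72

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
f1_rounded = round(f1, 2)
print(f1_rounded)
```

The recomputed value agrees with the reported F1 score of 0.83 to two decimal places.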
Conclusion
The Random Forest model effectively distinguishes between infection and colonization status of KP, while the Nomogram visually presents the predictive value of various factors, providing clinicians with a reference for formulating treatment plans. In the future, the accuracy of infection diagnosis can be further enhanced through artificial intelligence technology to optimize treatment strategies, thereby improving patient prognosis and reducing healthcare burdens.