1
Chang TH, Chen YD, Lu HHS, Wu JL, Mak K, Yu CS. Specific patterns and potential risk factors to predict 3-year risk of death among non-cancer patients with advanced chronic kidney disease by machine learning. Medicine (Baltimore) 2024; 103:e37112. [PMID: 38363886 PMCID: PMC10869094 DOI: 10.1097/md.0000000000037112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Accepted: 01/09/2024] [Indexed: 02/18/2024]
Abstract
Chronic kidney disease (CKD) is a major public health concern. However, machine learning studies on non-cancer patients with advanced CKD are limited, and the results of machine learning studies on cancer patients with CKD may not apply directly to non-cancer patients. We aimed to conduct a comprehensive investigation of risk factors for the 3-year risk of death among non-cancer advanced CKD patients with an estimated glomerular filtration rate < 60.0 mL/min/1.73 m² using several machine learning algorithms. In this retrospective cohort study, we collected data on in-hospital and emergency care patients from 2 hospitals in Taiwan from 2009 to 2019, including their International Classification of Diseases codes at admission and laboratory data from the hospitals' electronic medical records (EMRs). Several machine learning algorithms were used to analyze the potential impact and degree of influence of each factor on mortality and survival. Data were collected from 2 hospitals in northern Taiwan, with 6565 patients enrolled. After data cleaning, 26 risk factors and approximately 3887 advanced CKD patients from Shuang Ho Hospital were used as the training set. The validation set contained 2299 patients from Taipei Medical University Hospital. Albumin, PT-INR, and age were the top 3 significant risk factors, with the greatest influence on mortality prediction. In the receiver operating characteristic analysis, random forest had the highest accuracy, above 0.80. MLP and AdaBoost performed better on sensitivity and F1-score than the other methods. Additionally, SVM with a linear kernel function had the highest specificity, 0.9983, but its sensitivity and F1-score were poor. Logistic regression had the best performance, with an area under the curve of 0.8527. Evaluating the EMRs of Taiwanese advanced CKD patients with machine learning algorithms could provide physicians with a good approximation of the patients' 3-year risk of death.
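The accuracy, sensitivity, specificity, and F1-score quoted in this abstract can all be derived from a confusion matrix. The following is a minimal sketch, not the authors' code: the function name `binary_metrics` and the patient counts are hypothetical, chosen only to illustrate how a classifier can combine very high specificity with poor sensitivity and F1-score when deaths are the minority class.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Sensitivity (recall): share of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of actual negatives that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical cohort: 1000 patients, 200 deaths, and a model that
# almost never predicts death.
metrics = binary_metrics(tp=20, fp=2, tn=798, fn=180)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts the model rejects nearly every survivor correctly but finds only one in ten deaths, so specificity is close to 1 while sensitivity and F1-score stay low.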
Affiliation(s)
- Tzu-Hao Chang
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan
- Yu-Da Chen
- Department of Family Medicine, Taipei Medical University Hospital, Taipei, Taiwan
- School of Health Care Administration, College of Management, Taipei Medical University, Taipei, Taiwan
- Department of Family Medicine, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Henry Horng-Shing Lu
- Institute of Statistics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Institute of Data Science and Engineering, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Jenny L. Wu
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Cheng-Sheng Yu
- Graduate Institute of Data Science, College of Management, Taipei Medical University, Taipei, Taiwan
- Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan
- Fintech RD Center, Nan Shan Life Insurance Co., Ltd
2
Lu Y, Ning Y, Li Y, Zhu B, Zhang J, Yang Y, Chen W, Yan Z, Chen A, Shen B, Fang Y, Wang D, Song N, Ding X. Risk factor mining and prediction of urine protein progression in chronic kidney disease: a machine learning-based study. BMC Med Inform Decis Mak 2023; 23:173. [PMID: 37653403 PMCID: PMC10472702 DOI: 10.1186/s12911-023-02269-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Accepted: 08/17/2023] [Indexed: 09/02/2023]
Abstract
BACKGROUND Chronic kidney disease (CKD) is a global public health concern. Therefore, to provide timely intervention for non-hospitalized high-risk patients and to allocate limited clinical resources rationally, it is important to mine the key factors when designing a CKD prediction model. METHODS This study included data from 1,358 patients whose CKD was pathologically confirmed at Zhongshan Hospital between December 2017 and September 2020. A CKD prediction interpretation framework based on machine learning was proposed. From among 100 variables, 17 were selected for model construction through recursive feature elimination with logistic regression feature screening. Several machine learning classifiers, including extreme gradient boosting, Gaussian naive Bayes, a neural network, ridge regression, and the linear model logistic regression (LR), were trained, and an ensemble model was developed to predict 24-hour urine protein. The detailed relationship between the risk of CKD progression and these predictors was determined using a global interpretation. A patient-specific analysis was conducted using a local interpretation. RESULTS Among the single machine learning models, LR achieved the best performance, with an area under the curve (AUC) of 0.850. The ensemble model constructed using the voting integration method further improved the AUC to 0.856. The major predictors of moderate-to-severe severity included lower levels of 25-OH-vitamin, albumin, transferrin in males, and higher levels of cystatin C. CONCLUSIONS Compared with the single clinical kidney function evaluation indicators (eGFR, Scr), the machine learning model proposed in this study improved the prediction accuracy of CKD progression by 17.6% and 24.6%, respectively, and improved the AUC by 0.250 and 0.236, respectively. Our framework can achieve a good predictive interpretation and provide effective clinical decision support.
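The abstract reports that a voting ensemble raised the AUC above that of the best single model. The sketch below illustrates one way such a comparison could be set up. It is not the authors' framework: it assumes soft voting (averaging predicted probabilities), which the abstract does not specify, and the helper names `auc` and `soft_vote`, the labels, and the model scores are all hypothetical.

```python
def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # A positive ranked above a negative counts 1, a tie counts 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def soft_vote(prob_lists):
    """Average the predicted probabilities of several models, patient by patient."""
    return [sum(column) / len(column) for column in zip(*prob_lists)]


# Hypothetical labels and predicted probabilities from two models.
y = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]
model_b = [0.6, 0.8, 0.3, 0.2, 0.4, 0.1]
ensemble = soft_vote([model_a, model_b])

print("model A AUC:", auc(y, model_a))
print("model B AUC:", auc(y, model_b))
print("ensemble AUC:", auc(y, ensemble))
```

In this toy data each model misranks a different patient, so averaging corrects both errors. An ensemble is not guaranteed to beat its members on real data.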
Affiliation(s)
- Yufei Lu
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai Clinical Research Center for Kidney Disease, Shanghai Medical Center of Kidney, Shanghai Institute of Kidney and Dialysis, Shanghai Key Laboratory of Kidney and Blood Purification, Hemodialysis Quality Control Center of Shanghai, Shanghai, China
- Yichun Ning
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai Clinical Research Center for Kidney Disease, Shanghai Medical Center of Kidney, Shanghai Institute of Kidney and Dialysis, Shanghai Key Laboratory of Kidney and Blood Purification, Hemodialysis Quality Control Center of Shanghai, Shanghai, China
- Yang Li
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai Clinical Research Center for Kidney Disease, Shanghai Medical Center of Kidney, Shanghai Institute of Kidney and Dialysis, Shanghai Key Laboratory of Kidney and Blood Purification, Hemodialysis Quality Control Center of Shanghai, Shanghai, China
- Bowen Zhu
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai Clinical Research Center for Kidney Disease, Shanghai Medical Center of Kidney, Shanghai Institute of Kidney and Dialysis, Shanghai Key Laboratory of Kidney and Blood Purification, Hemodialysis Quality Control Center of Shanghai, Shanghai, China
- Jian Zhang
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai Clinical Research Center for Kidney Disease, Shanghai Medical Center of Kidney, Shanghai Institute of Kidney and Dialysis, Shanghai Key Laboratory of Kidney and Blood Purification, Hemodialysis Quality Control Center of Shanghai, Shanghai, China
- Yan Yang
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai Clinical Research Center for Kidney Disease, Shanghai Medical Center of Kidney, Shanghai Institute of Kidney and Dialysis, Shanghai Key Laboratory of Kidney and Blood Purification, Hemodialysis Quality Control Center of Shanghai, Shanghai, China
- Weize Chen
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai Clinical Research Center for Kidney Disease, Shanghai Medical Center of Kidney, Shanghai Institute of Kidney and Dialysis, Shanghai Key Laboratory of Kidney and Blood Purification, Hemodialysis Quality Control Center of Shanghai, Shanghai, China
- Zhixin Yan
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai Clinical Research Center for Kidney Disease, Shanghai Medical Center of Kidney, Shanghai Institute of Kidney and Dialysis, Shanghai Key Laboratory of Kidney and Blood Purification, Hemodialysis Quality Control Center of Shanghai, Shanghai, China
- Annan Chen
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai Clinical Research Center for Kidney Disease, Shanghai Medical Center of Kidney, Shanghai Institute of Kidney and Dialysis, Shanghai Key Laboratory of Kidney and Blood Purification, Hemodialysis Quality Control Center of Shanghai, Shanghai, China
- Bo Shen
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai Clinical Research Center for Kidney Disease, Shanghai Medical Center of Kidney, Shanghai Institute of Kidney and Dialysis, Shanghai Key Laboratory of Kidney and Blood Purification, Hemodialysis Quality Control Center of Shanghai, Shanghai, China
- Yi Fang
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai Clinical Research Center for Kidney Disease, Shanghai Medical Center of Kidney, Shanghai Institute of Kidney and Dialysis, Shanghai Key Laboratory of Kidney and Blood Purification, Hemodialysis Quality Control Center of Shanghai, Shanghai, China
- Dong Wang
- School of Computer Science & Information Engineering, Shanghai Institute of Technology, Shanghai, China
- Nana Song
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai Clinical Research Center for Kidney Disease, Shanghai Medical Center of Kidney, Shanghai Institute of Kidney and Dialysis, Shanghai Key Laboratory of Kidney and Blood Purification, Hemodialysis Quality Control Center of Shanghai, Shanghai, China
- Xiaoqiang Ding
- Department of Nephrology, Zhongshan Hospital, Fudan University, Shanghai Clinical Research Center for Kidney Disease, Shanghai Medical Center of Kidney, Shanghai Institute of Kidney and Dialysis, Shanghai Key Laboratory of Kidney and Blood Purification, Hemodialysis Quality Control Center of Shanghai, Shanghai, China
3
Predict, diagnose, and treat chronic kidney disease with machine learning: a systematic literature review. J Nephrol 2023; 36:1101-1117. [PMID: 36786976 PMCID: PMC10227138 DOI: 10.1007/s40620-023-01573-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 08/06/2022] [Accepted: 01/01/2023] [Indexed: 02/15/2023]
Abstract
OBJECTIVES In this systematic review we aimed to assess how artificial intelligence (AI), including machine learning (ML) techniques, has been deployed to predict, diagnose, and treat chronic kidney disease (CKD). We systematically reviewed the available evidence on these innovative techniques to improve CKD diagnosis and patient management. METHODS We included English language studies retrieved from PubMed. The review is therefore to be classified as a "rapid review", since it includes one database only and has language restrictions; the novelty and importance of the issue make missing relevant papers unlikely. We extracted 16 variables, including: main aim, studied population, data source, sample size, problem type (regression, classification), predictors used, and performance metrics. We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) approach; all main steps were done in duplicate. RESULTS From a total of 648 studies initially retrieved, 68 articles met the inclusion criteria. Models, as reported by authors, performed well, but the reported metrics were not homogeneous across articles and therefore direct comparison was not feasible. The most common aim was prediction of prognosis, followed by diagnosis of CKD. Algorithm generalizability and testing on diverse populations were rarely taken into account. Furthermore, the clinical evaluation and validation of the models/algorithms was perused; only a fraction of the included studies, 6 out of 68, were performed in a clinical context. CONCLUSIONS Machine learning is a promising tool for the prediction of risk, diagnosis, and therapy management for CKD patients. Nonetheless, future work is needed to address the interpretability, generalizability, and fairness of the models to ensure the safe application of such technologies in routine clinical practice.
4
Early Detection and Diagnosis of Chronic Kidney Disease Based on Selected Predominant Features. Journal of Healthcare Engineering 2023; 2023:3553216. [PMID: 36756136 PMCID: PMC9902122 DOI: 10.1155/2023/3553216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Revised: 04/19/2022] [Accepted: 11/25/2022] [Indexed: 02/03/2023]
Abstract
In numerous perilous cases, a quick medical decision is needed for the early detection of chronic diseases to avoid severe consequences that may be fatal. Chronic kidney disease (CKD) is a prevalent disease that presents a variety of challenges, including soaring costs for intervention, urgency, and, more importantly, difficulty in early detection of the disease. The current study presents a prediction-based method that helps detect and diagnose CKD, enabling fast and accurate decision-making at an early stage. A combination of preprocessing and feature selection methods was developed; additionally, several prediction models, such as K-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), and bagging, were trained on the processed dataset. The performance evaluation shows high reliability for all models in terms of accuracy, precision, sensitivity, F-measure, specificity, and area under the curve (AUC) score. Specifically, KNN outperformed the other models, with an accuracy of 99.50%, sensitivity of 99.2%, precision of 100%, specificity of 98.7%, and F-measure and AUC score of 99.6%. The experimental results show that KNN is the best-fitting model compared with existing state-of-the-art methods. Moreover, the reduced feature set shows that just a few clinical tests are enough to detect CKD, reducing the cost of diagnosis.
5
Ebiaredoh-Mienye SA, Swart TG, Esenogho E, Mienye ID. A Machine Learning Method with Filter-Based Feature Selection for Improved Prediction of Chronic Kidney Disease. Bioengineering (Basel) 2022; 9:350. [PMID: 36004875 PMCID: PMC9405039 DOI: 10.3390/bioengineering9080350] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 05/31/2022] [Revised: 07/06/2022] [Accepted: 07/21/2022] [Indexed: 11/25/2022]
Abstract
The high prevalence of chronic kidney disease (CKD) is a significant public health concern globally. The condition has a high mortality rate, especially in developing countries. CKD often goes undetected since there are no obvious early-stage symptoms. Meanwhile, early detection and timely clinical intervention are necessary to slow disease progression. Machine learning (ML) models can provide an efficient and cost-effective computer-aided diagnosis to assist clinicians in achieving early CKD detection. This research proposed an approach to effectively detect CKD by combining an information-gain-based feature selection technique and a cost-sensitive adaptive boosting (AdaBoost) classifier. An approach like this could save CKD screening time and cost since only a few clinical test attributes would be needed for the diagnosis. The proposed approach was benchmarked against recently proposed CKD prediction methods and well-known classifiers. Among these classifiers, the proposed cost-sensitive AdaBoost trained with the reduced feature set achieved the best classification performance, with an accuracy, sensitivity, and specificity of 99.8%, 100%, and 99.8%, respectively. Additionally, the experimental results show that the feature selection positively impacted the performance of the various classifiers. The proposed approach has produced an effective predictive model for CKD diagnosis and could be applied to more imbalanced medical datasets for effective disease detection.
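Information-gain-based feature selection, as named in this abstract, ranks each attribute by how much it reduces the entropy of the class label. The following is a minimal sketch for discrete attributes only, not the authors' pipeline: the function names, the attribute names, and the toy values are hypothetical, and the cost-sensitive AdaBoost stage is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels within each feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def top_k_features(columns, labels, k):
    """Return the names of the k features with the highest information gain."""
    ranked = sorted(
        columns,
        key=lambda name: information_gain(columns[name], labels),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical binarized attributes for 8 patients (1 = CKD, 0 = no CKD).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "attribute_a": [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly separates the classes
    "attribute_b": [1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to the label
    "attribute_c": [1, 1, 1, 0, 1, 0, 0, 0],  # partially informative
}
selected = top_k_features(columns, labels, k=2)
print(selected)
```

Only the selected attributes would then be passed to the classifier, which is how a filter-based method can reduce the number of clinical tests needed.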
Affiliation(s)
- Sarah A. Ebiaredoh-Mienye
- Center for Telecommunications, Department of Electrical and Electronic Engineering Science, University of Johannesburg, Johannesburg 2006, South Africa
- Theo G. Swart
- Center for Telecommunications, Department of Electrical and Electronic Engineering Science, University of Johannesburg, Johannesburg 2006, South Africa
- Ebenezer Esenogho
- Center for Telecommunications, Department of Electrical and Electronic Engineering Science, University of Johannesburg, Johannesburg 2006, South Africa
- Ibomoiye Domor Mienye
- Department of Electrical and Electronic Engineering Science, University of Johannesburg, Johannesburg 2006, South Africa