1
Darabi P, Gharibzadeh S, Khalili D, Bagherpour-Kalo M, Janani L. Optimizing cardiovascular disease mortality prediction: a super learner approach in the tehran lipid and glucose study. BMC Med Inform Decis Mak 2024; 24:97. [PMID: 38627734 PMCID: PMC11020797 DOI: 10.1186/s12911-024-02489-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Accepted: 03/22/2024] [Indexed: 04/19/2024]
Abstract
BACKGROUND & AIM Cardiovascular disease (CVD) is the leading cause of death worldwide and has a potential impact on health care costs. This study aimed to evaluate the performance of machine learning survival models and determine the optimum model for predicting CVD-related mortality. METHOD The study population comprised all participants in the Tehran Lipid and Glucose Study (TLGS) aged over 30 years. We used the Gradient Boosting model (GBM), Support Vector Machine (SVM), Super Learner (SL), and Cox proportional hazard (Cox-PH) models to predict CVD-related mortality using 26 features. The dataset was randomly divided into training (80%) and testing (20%) sets. To evaluate the performance of the methods, we used the Brier Score (BS), Prediction Error (PE), Concordance Index (C-index), and time-dependent Area Under the Curve (TD-AUC) criteria. Four different clinical models were also fitted to improve the performance of the methods. RESULTS Of 9258 participants (mean age 43.74 years; SD 15.51; range 20-91), 56.60% were female. CVD deaths accounted for 2.5% of participants (228). The proportion of deaths was significantly higher in men (67.98% male vs 32.02% female). Based on predefined selection criteria, the SL method had the best performance in predicting CVD-related mortality (TD-AUC > 93.50%). Among the machine learning (ML) methods, the SVM had the worst performance (TD-AUC = 90.13%). According to the relative effect, age, fasting blood sugar, systolic blood pressure, smoking, taking aspirin, diastolic blood pressure, Type 2 diabetes mellitus, hip circumference, body mass index (BMI), and triglyceride were identified as the most influential variables in predicting CVD-related mortality. CONCLUSION Compared with the Cox-PH model, machine learning models showed promising and sometimes better performance in predicting CVD-related mortality. This finding is based on the analysis of a large and diverse urban population from Tehran, Iran.
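As a rough illustration of one of the criteria named in this abstract, the Brier score at a single time horizon can be sketched as below. This is a simplified version, not the study's procedure: subjects censored before the horizon are simply dropped, whereas published analyses usually reweight them (inverse probability of censoring weighting). The function name and the toy data are hypothetical.

```python
def brier_score_at(t, times, events, surv_probs):
    """Mean squared error between predicted survival at horizon t and
    the observed status, over subjects whose status at t is known."""
    sq_errors = []
    for time, event, s_hat in zip(times, events, surv_probs):
        if time > t:
            alive = 1.0   # still event-free at the horizon
        elif event:
            alive = 0.0   # event occurred at or before the horizon
        else:
            continue      # censored before t: status unknown, dropped (simplification)
        sq_errors.append((s_hat - alive) ** 2)
    if not sq_errors:
        raise ValueError("no subject has a known status at the horizon")
    return sum(sq_errors) / len(sq_errors)

# Hypothetical toy data: follow-up times, event flags (1 = death),
# and a model's predicted probability of surviving past t = 5.
times = [2, 4, 6, 9]
events = [1, 0, 1, 0]
surv_probs = [0.2, 0.7, 0.8, 0.9]
bs = brier_score_at(5, times, events, surv_probs)
```

Lower values indicate better-calibrated predictions; a model that always predicts 0.5 scores 0.25.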
Affiliation(s)
- Parvaneh Darabi
  - Department of Biostatistics, School of Public Health, Iran University of Medical Sciences, Tehran, Iran
- Safoora Gharibzadeh
  - Department of Epidemiology and Biostatistics, Pasteur Institute of Iran, Tehran, Iran
- Davood Khalili
  - Prevention of Metabolic Disorders Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mehrdad Bagherpour-Kalo
  - Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran
- Leila Janani
  - Department of Biostatistics, School of Public Health, Iran University of Medical Sciences, Tehran, Iran
  - Imperial Clinical Trials Unit, School of Public Health, Imperial College London, London, UK
2
Sarica A, Aracri F, Bianco MG, Arcuri F, Quattrone A, Quattrone A. Explainability of random survival forests in predicting conversion risk from mild cognitive impairment to Alzheimer's disease. Brain Inform 2023; 10:31. [PMID: 37979033 PMCID: PMC10657350 DOI: 10.1186/s40708-023-00211-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 11/01/2023] [Indexed: 11/19/2023]
Abstract
Random Survival Forests (RSF) have recently shown better performance than statistical survival methods such as Cox proportional hazard (CPH) in predicting conversion risk from mild cognitive impairment (MCI) to Alzheimer's disease (AD). However, RSF application in real-world clinical settings is still limited by its black-box nature. For this reason, we aimed to provide a comprehensive study of RSF explainability with SHapley Additive exPlanations (SHAP) on biomarkers of stable and progressive patients (sMCI and pMCI) from the Alzheimer's Disease Neuroimaging Initiative. We evaluated three global explanations (RSF feature importance, permutation importance and SHAP importance) and quantitatively compared them with Rank-Biased Overlap (RBO). Moreover, we assessed whether multicollinearity among variables may perturb the SHAP outcome. Lastly, we stratified pMCI test patients into high, medium and low risk grades to investigate the individual SHAP explanation of one pMCI patient per risk group. We confirmed that RSF had higher accuracy (0.890) than CPH (0.819), and its stability and robustness were demonstrated by high overlap (RBO > 90%) between feature rankings within the first eight features. SHAP local explanations with and without correlated variables showed no substantial difference, indicating that multicollinearity did not alter the model. FDG, ABETA42 and HCI were the most important features in global explanations, with the highest contribution also in local explanations. FAQ, mPACCdigit, mPACCtrailsB and RAVLT immediate had the highest influence among all clinical and neuropsychological assessments in increasing progression risk, as was particularly evident in the pMCI patients' individual explanations. In conclusion, our findings suggest that RSF is a useful tool to support clinicians in estimating conversion-to-AD risk and that the SHAP explainer boosts its clinical utility with intelligible and interpretable individual outcomes that highlight key features associated with AD prognosis.
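To make the ranking comparison concrete, here is a minimal sketch of rank-biased overlap in its truncated form. It is an assumption-laden illustration rather than the authors' computation: the abstract does not say which RBO variant or persistence parameter `p` was used, the function name is hypothetical, and the feature labels are borrowed from the abstract purely as toy inputs.

```python
def rbo_truncated(ranking_a, ranking_b, p=0.9):
    """Truncated rank-biased overlap between two rankings.

    Sums (1 - p) * p**(d - 1) * A_d over depths d = 1..k, where A_d is the
    fraction of items shared by the two top-d prefixes and k is the shorter
    list length. The truncated sum is a lower bound on full RBO, because
    the omitted deeper terms are non-negative.
    """
    k = min(len(ranking_a), len(ranking_b))
    seen_a, seen_b = set(), set()
    score = 0.0
    for d in range(1, k + 1):
        seen_a.add(ranking_a[d - 1])
        seen_b.add(ranking_b[d - 1])
        agreement = len(seen_a & seen_b) / d  # overlap of the top-d prefixes
        score += (1 - p) * p ** (d - 1) * agreement
    return score

# Toy rankings (illustrative only): two importance orderings of three features.
same = rbo_truncated(["FDG", "ABETA42", "HCI"], ["FDG", "ABETA42", "HCI"])
swapped = rbo_truncated(["FDG", "ABETA42", "HCI"], ["ABETA42", "FDG", "HCI"])
```

Because the weights decay geometrically with depth, disagreement at the top of the rankings costs more than disagreement further down, which is why the swapped pair scores lower than the identical pair.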
Affiliation(s)
- Alessia Sarica
  - Neuroscience Research Center, Department of Medical and Surgical Sciences, Magna Graecia University, viale Europa, loc. Germaneto, 88100, Catanzaro, Italy
- Federica Aracri
  - Neuroscience Research Center, Department of Medical and Surgical Sciences, Magna Graecia University, viale Europa, loc. Germaneto, 88100, Catanzaro, Italy
- Maria Giovanna Bianco
  - Neuroscience Research Center, Department of Medical and Surgical Sciences, Magna Graecia University, viale Europa, loc. Germaneto, 88100, Catanzaro, Italy
- Fulvia Arcuri
  - Neuroscience Research Center, Department of Medical and Surgical Sciences, Magna Graecia University, viale Europa, loc. Germaneto, 88100, Catanzaro, Italy
- Andrea Quattrone
  - Neuroscience Research Center, Department of Medical and Surgical Sciences, Magna Graecia University, viale Europa, loc. Germaneto, 88100, Catanzaro, Italy
- Aldo Quattrone
  - Neuroscience Research Center, Department of Medical and Surgical Sciences, Magna Graecia University, viale Europa, loc. Germaneto, 88100, Catanzaro, Italy
3
Kar İ, Kocaman G, İbrahimov F, Enön S, Coşgun E, Elhan AH. Comparison of deep learning-based recurrence-free survival with random survival forest and Cox proportional hazard models in Stage-I NSCLC patients. Cancer Med 2023; 12:19272-19278. [PMID: 37644818 PMCID: PMC10557877 DOI: 10.1002/cam4.6479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Revised: 08/10/2023] [Accepted: 08/16/2023] [Indexed: 08/31/2023]
Abstract
BACKGROUND The curative treatment for Stage I non-small cell lung cancer (NSCLC) is surgical resection. Even for Stage I patients, the probability of recurrence after curative treatment is around 20%. METHODS In this retrospective study, we included 268 Stage I NSCLC patients operated on between January 2008 and June 2018 to analyze the prognostic factors (pathological stage, histological type, number of sampled mediastinal lymph node stations, type of resection, SUVmax of the lesion) that may affect relapse with three different methods: Cox proportional hazard (CoxPH), random survival forest (RSF), and DeepSurv. We compared the performance of these methods with Harrell's C-index. The dataset was randomly split into training and test sets. RESULTS In the training set, DeepSurv showed the best performance among the three models with a C-index of 0.832, followed by RSF (0.675) and CoxPH (0.672). In the test set, RSF showed the best performance among the three models, followed by DeepSurv with 0.677 and CoxPH with 0.625. CONCLUSION Machine-learning techniques can be useful in predicting recurrence of lung cancer and can guide clinicians in choosing both adjuvant treatment options and the best follow-up programs.
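Harrell's C-index, the comparison metric in this abstract, can be sketched in a few lines. The version below is a minimal illustration on hypothetical toy data: it skips pairs with tied observed times, which full implementations treat more carefully, and the function name is not taken from any library.

```python
from itertools import combinations

def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs that the risk score orders correctly.

    A pair is comparable when the subject with the shorter observed time
    had the event. Ties in risk score count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue  # tied times, or earlier subject censored: not comparable here
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical toy data: higher risk score should mean earlier recurrence.
times = [5, 8, 12, 20]
events = [1, 0, 1, 0]          # 1 = recurrence observed, 0 = censored
risk = [0.9, 0.4, 0.6, 0.1]
c = harrell_c_index(times, events, risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the 0.832, 0.677 and 0.625 figures above should be read.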
Affiliation(s)
- İrem Kar
  - Department of Biostatistics, Ankara University School of Medicine, Ankara, Turkey
- Gökhan Kocaman
  - Department of Thoracic Surgery, Ankara University School of Medicine, Ankara, Turkey
- Farrukh İbrahimov
  - Department of Thoracic Surgery, Ankara University School of Medicine, Ankara, Turkey
- Serkan Enön
  - Department of Thoracic Surgery, Ankara University School of Medicine, Ankara, Turkey
- Erdal Coşgun
  - Genomics Team, Microsoft Research & AI, Redmond, Washington, USA
- Atilla Halil Elhan
  - Department of Biostatistics, Ankara University School of Medicine, Ankara, Turkey
4
Su CL, Chiou SH, Lin FC, Platt RW. Analysis of survival data with cure fraction and variable selection: A pseudo-observations approach. Stat Methods Med Res 2022; 31:2037-2053. [PMID: 35754373 PMCID: PMC9660265 DOI: 10.1177/09622802221108579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2022]
Abstract
In biomedical studies, survival data with a cure fraction (the proportion of
subjects cured of disease) are commonly encountered. The mixture cure and
bounded cumulative hazard models are two main types of cure fraction models when
analyzing survival data with long-term survivors. In this article, in the
framework of the Cox proportional hazards mixture cure model and bounded
cumulative hazard model, we propose several estimators utilizing
pseudo-observations to assess the effects of covariates on the cure rate and the
risk of having the event of interest for survival data with a cure fraction. A
variable selection procedure is also presented based on the pseudo-observations
using penalized generalized estimating equations for proportional hazards
mixture cure and bounded cumulative hazard models. Extensive simulation studies
are conducted to examine the proposed methods. The proposed technique is
demonstrated through applications to a melanoma study and a dental data set with
high-dimensional covariates.
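For orientation, the two model families named in this abstract are commonly written as follows. This is a sketch in standard textbook notation, not the paper's own: the symbols $p(z)$, $\gamma$, $\beta$, $S_0$, $\theta(x)$ and $F$ are assumptions introduced here, and the logistic link for $p(z)$ is one common choice rather than something the abstract states.

```latex
% Mixture cure model: p(z) is the probability of being uncured (susceptible),
% so the cure fraction is 1 - p(z)
S_{\text{pop}}(t \mid x, z) = 1 - p(z) + p(z)\, S_u(t \mid x),
\qquad \operatorname{logit} p(z) = \gamma^\top z,
\qquad S_u(t \mid x) = S_0(t)^{\exp(\beta^\top x)}

% Bounded cumulative hazard model: F is a proper distribution function,
% so the cumulative hazard is bounded by theta(x)
S_{\text{pop}}(t \mid x) = \exp\{-\theta(x)\, F(t)\},
\qquad \theta(x) = \exp(\beta^\top x),
\qquad \lim_{t \to \infty} S_{\text{pop}}(t \mid x) = \exp\{-\theta(x)\}
```

In both forms the population survival function levels off at a positive value instead of decaying to zero, and that plateau is the cure fraction.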
Affiliation(s)
- Chien-Lin Su
  - Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, Québec, Canada
  - Centre for Clinical Epidemiology, Lady Davis Institute, Jewish General Hospital, Montréal, Québec, Canada
  - Peri and Post Approval Studies, Strategic and Scientific Affairs, PPD, part of Thermo Fisher Scientific, Montréal, Québec, Canada
- Sy Han Chiou
  - Department of Mathematical Sciences, University of Texas at Dallas, Richardson, TX, USA
- Feng-Chang Lin
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
- Robert W Platt
  - Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, Québec, Canada
  - Centre for Clinical Epidemiology, Lady Davis Institute, Jewish General Hospital, Montréal, Québec, Canada
5
Gelcho GN, Bekele MG. Modeling Time to Cure of Deep Vein Thrombosis Using Cox Proportional Model in Southwest of Ethiopia. Ethiop J Health Sci 2022; 32:555-562. [PMID: 35813688 PMCID: PMC9214733 DOI: 10.4314/ejhs.v32i3.11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2021] [Accepted: 02/28/2022] [Indexed: 11/20/2022]
Abstract
Background Globally, there are about 10 million cases of deep vein thrombosis every year, and it is the third leading cardiovascular disease after myocardial infarction and stroke. The objective of the study was to assess risk factors for time to cure in patients with deep vein thrombosis in southwest Ethiopia. Methods A retrospective cohort study design was used. The study population was deep vein thrombosis patients at purposively selected hospitals in Southwest Ethiopia from January 2017 to December 2020. A Cox proportional hazard model was used to identify risk factors associated with deep vein thrombosis. Results Of the 1068 registered deep vein thrombosis patients, 263 (24.6%) were cured during the study period and 805 (75.4%) were censored. Results of the Cox proportional hazard model show that age, gender, family history of deep vein thrombosis, smoking status, immobilization and alcohol consumption were factors associated with deep vein thrombosis (p-value < 0.05). Conclusion Patients with a family history of deep vein thrombosis, prolonged immobilization, age greater than 50 years, cigarette smoking, female sex (non-pregnant) or alcohol use had a longer time to cure of deep vein thrombosis than others.
Affiliation(s)
- Gurmessa Nugussu Gelcho
  - Department of Statistics, College of Natural Sciences, Jimma University, Jimma, Oromia, Ethiopia
- Mosisa Girma Bekele
  - Department of Statistics, College of Natural Sciences, Jimma University, Jimma, Oromia, Ethiopia
6
Kamphorst B, Rooijakkers T, Veugen T, Cellamare M, Knoors D. Accurate training of the Cox proportional hazards model on vertically-partitioned data while preserving privacy. BMC Med Inform Decis Mak 2022; 22:49. [PMID: 35209883 PMCID: PMC8867891 DOI: 10.1186/s12911-022-01771-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/08/2021] [Accepted: 01/20/2022] [Indexed: 11/10/2022]
Abstract
Background Analysing distributed medical data is challenging because of data sensitivity and the various regulations on accessing and combining data. Some privacy-preserving methods are known for analysing horizontally-partitioned data, where different organisations have similar data on disjoint sets of people. Technically more challenging is the case of vertically-partitioned data, where organisations hold different data on overlapping sets of people. We use an emerging technology based on cryptographic techniques called secure multi-party computation (MPC), and apply it to perform privacy-preserving survival analysis on vertically-distributed data by means of the Cox proportional hazards (CPH) model. Both MPC and CPH are explained. Methods We use a Newton-Raphson solver to securely train the CPH model with MPC, jointly with all data holders, without revealing any sensitive data. In securely computing the log-partial likelihood in each iteration, we run into several technical challenges in preserving the efficiency and security of our solution. To tackle these challenges, we generalize a cryptographic protocol for securely computing the inverse of the Hessian matrix and develop a new method for securely computing exponentiations. A theoretical complexity estimate is given to get insight into the computational and communication effort that is needed. Results Our secure solution is implemented in a setting with three different machines, each representing a different data holder, which can communicate through the internet. The MPyC platform is used for implementing this privacy-preserving solution to obtain the CPH model. We test the accuracy and computation time of our methods on three standard benchmark survival datasets. We identify future work to make our solution more efficient. Conclusions Our secure solution is comparable with the standard, non-secure solver in terms of accuracy and convergence speed. The computation time is considerably larger, although the theoretical complexity is still cubic in the number of covariates and quadratic in the number of subjects. We conclude that this is a promising way of performing parametric survival analysis on vertically-distributed medical data, while realising a high level of security and privacy.
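For readers unfamiliar with the underlying optimisation, the plain (non-secure) Newton-Raphson iteration on the Cox log partial likelihood can be sketched as below. This is not the paper's MPC protocol: it handles a single covariate, assumes no tied event times, and runs on one machine with all data in the clear. The function names and toy data are hypothetical.

```python
import math

def cox_score_and_info(beta, times, events, x):
    """Score (first derivative) and observed information (negative second
    derivative) of the Cox log partial likelihood for one covariate."""
    score, info = 0.0, 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        w = [math.exp(beta * x[j]) for j in risk]
        s0 = sum(w)
        s1 = sum(wj * x[j] for wj, j in zip(w, risk))
        s2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk))
        score += x[i] - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return score, info

def fit_cox_newton(times, events, x, max_iter=25, tol=1e-10):
    """Newton-Raphson: repeatedly step by score / information from beta = 0."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = cox_score_and_info(beta, times, events, x)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

# Hypothetical toy data with one binary covariate and one censored subject.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 0, 1, 1, 1]
x = [1, 0, 1, 0, 1, 0]
beta_hat = fit_cox_newton(times, events, x)
```

With several covariates the scalar division becomes a linear solve against the Hessian, which is the matrix inversion the abstract describes securing. The naive risk-set loop in this sketch is quadratic in the number of subjects.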
Affiliation(s)
- Bart Kamphorst
  - Cyber Security and Robustness, Netherlands Organisation for Applied Scientific Research, The Hague, The Netherlands
- Thomas Rooijakkers
  - Cyber Security and Robustness, Netherlands Organisation for Applied Scientific Research, The Hague, The Netherlands
- Thijs Veugen
  - Cyber Security and Robustness, Netherlands Organisation for Applied Scientific Research, The Hague, The Netherlands
  - Cryptology, Centrum Wiskunde and Informatica, Amsterdam, The Netherlands
- Matteo Cellamare
  - Research and Development, Netherlands Comprehensive Cancer Organisation, Eindhoven, The Netherlands
- Daan Knoors
  - Research and Development, Netherlands Comprehensive Cancer Organisation, Eindhoven, The Netherlands
7
Alqahtani K, Taylor CC, Wood HM, Gusnanto A. Sparse modelling of cancer patients' survival based on genomic copy number alterations. J Biomed Inform 2022; 128:104025. [PMID: 35181494 DOI: 10.1016/j.jbi.2022.104025] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/27/2021] [Revised: 02/03/2022] [Accepted: 02/05/2022] [Indexed: 11/24/2022]
Abstract
Copy number alterations (CNA) are structural variations in the genome, in which some regions exhibit more or fewer than the normal two chromosomal copies. This genomic CNA profile provides critical information on tumour progression and is therefore informative for patients' survival. It is currently a statistical challenge to model patients' survival using their genomic CNA profiles while at the same time identifying regions in the genome that are associated with patients' survival. Some methods have been proposed, including the Cox proportional hazard (PH) model with ridge, lasso, or elastic net penalties. However, these methods do not take the general dependencies between genomic regions into account and produce results that are difficult to interpret. In this paper, we extend the elastic net penalty by introducing an additional penalty that takes into account general dependencies between genomic regions. This new model produces smooth parameter estimates while simultaneously performing variable selection via a sparse solution. The results indicate that the proposed method shows better prediction performance than other models in our simulation study, while enabling us to investigate regions in the genome that are associated with patients' survival with sensible interpretation. We illustrate the method using a real dataset from a lung cancer cohort and simulated data.
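As background for the penalties mentioned here, the standard elastic net estimator for the Cox model is usually written as below. The notation ($\lambda$, $\alpha$, $\delta_i$, $R(t_i)$) is a common convention assumed for this sketch. The additional dependency penalty that the paper proposes is not specified in the abstract, so it is deliberately not shown.

```latex
\hat{\beta} = \arg\max_{\beta}\; \ell(\beta)
  - \lambda \left[ \alpha \lVert \beta \rVert_1
  + \tfrac{1 - \alpha}{2} \lVert \beta \rVert_2^2 \right],
\qquad
\ell(\beta) = \sum_{i:\,\delta_i = 1} \Big[ x_i^\top \beta
  - \log \sum_{j \in R(t_i)} \exp(x_j^\top \beta) \Big]
```

Setting $\alpha = 1$ gives the lasso and $\alpha = 0$ gives ridge. The $\ell_1$ term produces sparsity (variable selection) and the $\ell_2$ term shrinks correlated coefficients together, but neither encodes which genomic regions are neighbours.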
Affiliation(s)
- Khaled Alqahtani
  - Department of Mathematics, College of Science and Humanitarian Studies, Prince Sattam Bin Abdulaziz University, Al Kharj, Saudi Arabia
  - Department of Statistics, University of Leeds, Leeds LS2 9JT, United Kingdom
- Charles C Taylor
  - Department of Statistics, University of Leeds, Leeds LS2 9JT, United Kingdom
- Henry M Wood
  - Leeds Institute of Medical Research at St. James's, University of Leeds, Leeds LS9 7TF
- Arief Gusnanto
  - Department of Statistics, University of Leeds, Leeds LS2 9JT, United Kingdom
8
Yang Y, Ma X, Wang Y, Ding X. Prognosis prediction of extremity and trunk wall soft-tissue sarcomas treated with surgical resection with radiomic analysis based on random survival forest. Updates Surg 2021; 74:355-365. [PMID: 34003477 DOI: 10.1007/s13304-021-01074-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/15/2020] [Accepted: 04/29/2021] [Indexed: 02/05/2023]
Abstract
Many studies have applied machine learning methods to find associations between radiomic features and clinical outcomes. Random survival forests (RSF), an accurate classifier, rank all candidate variables by importance value. No previous study has used RSF to find radiomic predictors in patients with extremity and trunk wall soft-tissue sarcomas. This study aimed to determine associations between radiomic features and overall survival (OS) by RSF analysis: to identify radiomic features with high importance values, construct predictive models for OS incorporating clinical characteristics, and evaluate the models' performance with different methods. We collected clinical characteristics and radiomic features extracted from plain and contrast-enhanced computed tomography (CT) from 353 patients with extremity and trunk wall soft-tissue sarcomas treated with surgical resection. All radiomic features were analyzed by Cox proportional hazard (CPH) analysis followed by RSF analysis. The association between radiomics-predicted risks and OS was assessed by Kaplan-Meier analysis. All clinical features were screened by CPH analysis. Prognostic clinical and radiomic parameters were fitted into RSF and CPH integrative models for OS in the training cohort. The concordance indexes (C-index) and Brier scores of both models were evaluated in the training and testing cohorts. The model with better predictive performance was interpreted with a nomogram and calibration plots. Among all 86 radiomic features, three variables with high importance values were selected. The RSF on these three features distinguished patients with high predicted risks from patients with low predicted risks for OS in the training set (P < 0.001) using Kaplan-Meier analysis. Age, lymph node involvement and grade were incorporated into the combined models for OS (P < 0.05). The C-indexes of both integrative models remained above 0.80 and their Brier scores remained below 15.0 in the training and testing datasets. The RSF model showed a slight advantage over the CPH model, in that its calibration curve showed favorable agreement between predicted and actual survival probabilities for the 3-year and 5-year survival predictions. The multimodality RSF model including clinical and radiomic characteristics showed high capacity for predicting OS, which might assist individualized therapeutic regimens. Level III, prognostic study.
Affiliation(s)
- Yuhan Yang
  - West China School of Medicine, Sichuan University, No.17 People's South Road, Chengdu, 610041, Sichuan, China
- Xuelei Ma
  - State Key Laboratory of Biotherapy, Department of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Guoxue Road, Chengdu, 610041, China
- Yixi Wang
  - West China School of Medicine, Sichuan University, No.17 People's South Road, Chengdu, 610041, Sichuan, China
- Xinyan Ding
  - West China School of Medicine, Sichuan University, No.17 People's South Road, Chengdu, 610041, Sichuan, China
9
Hassan A, De Gruttola V, Hu YW, Sheng Z, Poortinga K, Wertheim JO. The Relationship Between the Human Immunodeficiency Virus-1 Transmission Network and the HIV Care Continuum in Los Angeles County. Clin Infect Dis 2020; 71:e384-e391. [PMID: 32020172 PMCID: PMC7904072 DOI: 10.1093/cid/ciaa114] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/13/2019] [Accepted: 02/03/2020] [Indexed: 12/22/2022]
Abstract
BACKGROUND Public health action combating human immunodeficiency virus (HIV) includes facilitating navigation through the HIV continuum of care: timely diagnosis followed by linkage to care and initiation of antiretroviral therapy to suppress viral replication. Molecular epidemiology can identify rapidly growing HIV genetic transmission clusters. How progression through the care continuum relates to transmission clusters has not been previously characterized. METHODS We performed a retrospective study on HIV surveillance data from 5226 adult cases in Los Angeles County diagnosed from 2010 through 2014. Genetic transmission clusters were constructed using HIV-TRACE. Cox proportional hazard models were used to estimate the impact of transmission cluster growth on the time intervals between care continuum events. Gamma frailty models incorporated the effect of heterogeneity associated with genetic transmission clusters. RESULTS In contrast to our expectations, there were no differences in time to the care continuum events among individuals in clusters with different growth dynamics. However, upon achieving viral suppression, individuals in high growth clusters were slower to experience viral rebound (hazard ratio 0.83, P = .011) compared with individuals in low growth clusters. Heterogeneity associated with cluster membership in the timing to each event in the care continuum was highly significant (P < .001), with and without adjustment for transmission risk and demographics. CONCLUSIONS Individuals within the same transmission cluster have more similar trajectories through the HIV care continuum than those across transmission clusters. These findings suggest molecular epidemiology can assist public health officials in identifying clusters of individuals who may benefit from assistance in navigating the HIV care continuum.
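The gamma frailty models mentioned in the methods are typically specified as below. This is a sketch in standard shared-frailty notation, assumed here rather than taken from the paper: $i$ indexes transmission clusters, $j$ indexes individuals within a cluster, and $\theta$ is the frailty variance.

```latex
h_{ij}(t \mid z_i) = z_i\, h_0(t)\, \exp(\beta^\top x_{ij}),
\qquad z_i \sim \operatorname{Gamma}(1/\theta,\, 1/\theta),
\quad \mathbb{E}[z_i] = 1,\ \operatorname{Var}(z_i) = \theta
```

Everyone in the same cluster shares the multiplier $z_i$, so a test of $\theta = 0$ asks whether cluster membership explains heterogeneity in event timing beyond the measured covariates.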
Affiliation(s)
- Adiba Hassan
  - Department of Medicine, University of California, San Diego, California, USA
- Victor De Gruttola
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
  - Department of Family Medicine, University of California, San Diego, California, USA
- Yunyin W Hu
  - Division of HIV and STD Programs, Los Angeles County Department of Public Health, Los Angeles, California, USA
- Zhijuan Sheng
  - Division of HIV and STD Programs, Los Angeles County Department of Public Health, Los Angeles, California, USA
- Kathleen Poortinga
  - Division of HIV and STD Programs, Los Angeles County Department of Public Health, Los Angeles, California, USA
- Joel O Wertheim
  - Department of Medicine, University of California, San Diego, California, USA
10
Beaulac C, Rosenthal JS, Pei Q, Friedman D, Wolden S, Hodgson D. An evaluation of machine learning techniques to predict the outcome of children treated for Hodgkin-Lymphoma on the AHOD0031 trial: A report from the Children's Oncology Group. Appl Artif Intell 2020; 34:1100-1114. [PMID: 33731974 PMCID: PMC7963212 DOI: 10.1080/08839514.2020.1815151] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
In this manuscript we analyze a data set containing information on children with Hodgkin Lymphoma (HL) enrolled in a clinical trial. Treatments received and survival status were collected together with other covariates such as demographics and clinical measurements. Our main task is to explore the potential of machine learning (ML) algorithms in a survival analysis context in order to improve on the Cox Proportional Hazard (CoxPH) model. We discuss the weaknesses of the CoxPH model that we would like to improve upon, and then introduce multiple algorithms, from well-established ones to state-of-the-art models, that address these issues. We then compare the models according to the concordance index and the Brier score. Finally, we produce a series of recommendations, based on our experience, for practitioners who would like to benefit from recent advances in artificial intelligence.
Affiliation(s)
- Cédric Beaulac
  - Department of Statistical Sciences, University of Toronto, Toronto, Canada
- Qinglin Pei
  - Department of Biostatistics, University of Florida, Gainesville, USA
- Debra Friedman
  - Department of Pediatrics, Vanderbilt University, Nashville, USA
- Suzanne Wolden
  - Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, USA
- David Hodgson
  - Department of Radiation Oncology, University of Toronto, Toronto, Canada
11
Carney G, Bassett K, Wright JM, Maclure M, McGuire N, Dormuth CR. Comparison of cholinesterase inhibitor safety in real-world practice. Alzheimers Dement (N Y) 2019; 5:732-739. [PMID: 31921965 PMCID: PMC6944712 DOI: 10.1016/j.trci.2019.09.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
Abstract
Introduction Cholinesterase inhibitors (ChEIs) are widely used to treat mild to moderate Alzheimer's disease and related dementia. Clinical trials have focused on placebo comparisons, inadequately addressing within-class comparative safety. Methods New users of ChEIs in British Columbia were categorized into five study cohorts: low-dose donepezil, high-dose donepezil, galantamine, rivastigmine patch, and oral rivastigmine. Comparative safety of ChEIs was assessed with hazard ratios from propensity score-adjusted Cox regression. Results Compared with low-dose donepezil, galantamine use was associated with a lower risk of mortality (adjusted hazard ratio: 0.84, 95% confidence interval: 0.60–1.18), cardiovascular serious adverse events (adjusted hazard ratio: 0.78, 95% confidence interval: 0.62–0.98), and entry into a residential care facility (adjusted hazard ratio: 0.72, 95% confidence interval: 0.59–0.89). Discussion Given the absence of randomized trial data showing clinically meaningful benefit of ChEI therapy in Alzheimer's disease, our study suggests preferential use of galantamine may at least be associated with fewer adverse events than treatment with donepezil or rivastigmine. Galantamine was associated with fewer adverse events than donepezil or rivastigmine. Galantamine users experienced longer independent living. The 3-year risk of cardiovascular events and mortality was lowest with galantamine.
Collapse
Affiliation(s)
- Greg Carney
- Therapeutics Initiative, University of British Columbia, Vancouver, BC, Canada; Department of Anesthesiology, Pharmacology and Therapeutics, University of British Columbia, Vancouver, BC, Canada
- Ken Bassett
- Therapeutics Initiative, University of British Columbia, Vancouver, BC, Canada; Department of Anesthesiology, Pharmacology and Therapeutics, University of British Columbia, Vancouver, BC, Canada; Department of Family Practice, University of British Columbia, Vancouver, BC, Canada
- James M Wright
- Therapeutics Initiative, University of British Columbia, Vancouver, BC, Canada; Department of Anesthesiology, Pharmacology and Therapeutics, University of British Columbia, Vancouver, BC, Canada; Department of Medicine, University of British Columbia, Vancouver, BC, Canada
- Malcolm Maclure
- Therapeutics Initiative, University of British Columbia, Vancouver, BC, Canada; Department of Anesthesiology, Pharmacology and Therapeutics, University of British Columbia, Vancouver, BC, Canada
- Nicolette McGuire
- Research and Innovation Division, B.C. Ministry of Health, Victoria, BC, Canada
- Colin R Dormuth
- Therapeutics Initiative, University of British Columbia, Vancouver, BC, Canada; Department of Anesthesiology, Pharmacology and Therapeutics, University of British Columbia, Vancouver, BC, Canada
|
12
|
Matsuo K, Purushotham S, Jiang B, Mandelbaum RS, Takiuchi T, Liu Y, Roman LD. Survival outcome prediction in cervical cancer: Cox models vs deep-learning model. Am J Obstet Gynecol 2019; 220:381.e1-381.e14. [PMID: 30582927 DOI: 10.1016/j.ajog.2018.12.030] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Received: 10/05/2018] [Revised: 12/06/2018] [Accepted: 12/17/2018] [Indexed: 01/20/2023]
Abstract
BACKGROUND Historically, the Cox proportional hazard regression model has been the mainstay for survival analyses in oncologic research. The Cox proportional hazard regression model generally is used based on an assumption of linear association. However, it is likely that, in reality, there are many clinicopathologic features that exhibit a nonlinear association in biomedicine. OBJECTIVE The purpose of this study was to compare the deep-learning neural network model and the Cox proportional hazard regression model in the prediction of survival in women with cervical cancer. STUDY DESIGN This was a retrospective pilot study of consecutive cases of newly diagnosed stage I-IV cervical cancer from 2000-2014. A total of 40 features that included patient demographics, vital signs, laboratory test results, tumor characteristics, and treatment types were assessed for analysis and grouped into 3 feature sets. The deep-learning neural network model was compared with the Cox proportional hazard regression model and 3 other survival analysis models for progression-free survival and overall survival. Mean absolute error and concordance index were used to assess the performance of these 5 models. RESULTS There were 768 women included in the analysis. The median age was 49 years, and the majority were Hispanic (71.7%). The majority of tumors were squamous (75.3%) and stage I (48.7%). The median follow-up time was 40.2 months; there were 241 events for recurrence and progression and 170 deaths during the follow-up period. The deep-learning model showed promising results in the prediction of progression-free survival when compared with the Cox proportional hazard regression model (mean absolute error, 29.3 vs 316.2). The deep-learning model also outperformed all the other models, including the Cox proportional hazard regression model, for overall survival (mean absolute error, Cox proportional hazard regression vs deep-learning, 43.6 vs 30.7). 
The performance of the deep-learning model further improved when more features were included (concordance index for progression-free survival: 0.695 for 20 features, 0.787 for 36 features, and 0.795 for 40 features). There were 10 features for progression-free survival and 3 features for overall survival that demonstrated significance only in the deep-learning model, but not in the Cox proportional hazard regression model. There were no features for progression-free survival and 3 features for overall survival that demonstrated significance only in the Cox proportional hazard regression model, but not in the deep-learning model. CONCLUSION Our study suggests that the deep-learning neural network model may be a useful analytic tool for survival prediction in women with cervical cancer because it exhibited superior performance compared with the Cox proportional hazard regression model. This novel analytic approach may provide clinicians with meaningful survival information that potentially could be integrated into treatment decision-making and planning. Further validation studies are necessary to support this pilot study.
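The concordance index quoted in this abstract can be illustrated with a minimal sketch. The function below is a plain-Python version of Harrell's C under the usual definition (a pair is comparable only when the subject with the shorter observed time had an event); it is not the authors' code, and the function name and the toy data are hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the risk score orders correctly.

    times       -- observed follow-up times
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Skip tied times and pairs whose earlier subject was censored:
        # their true ordering is unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four subjects, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which figures such as 0.695 and 0.795 are read.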
Affiliation(s)
- Koji Matsuo
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, University of Southern California, Los Angeles, CA; Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA
- Sanjay Purushotham
- Department of Computer Science, University of Southern California, Los Angeles, CA
- Bo Jiang
- Department of Computer Science, University of Southern California, Los Angeles, CA
- Rachel S Mandelbaum
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, University of Southern California, Los Angeles, CA
- Tsuyoshi Takiuchi
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, University of Southern California, Los Angeles, CA
- Yan Liu
- Department of Computer Science, University of Southern California, Los Angeles, CA
- Lynda D Roman
- Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA
|
13
|
Van der Mussele S, Fransen E, Struyfs H, Luyckx J, Mariën P, Saerens J, Somers N, Goeman J, De Deyn PP, Engelborghs S. Depression in mild cognitive impairment is associated with progression to Alzheimer's disease: a longitudinal study. J Alzheimers Dis 2015; 42:1239-50. [PMID: 25024328 DOI: 10.3233/jad-140405] [Citation(s) in RCA: 88] [Impact Index Per Article: 9.8] [Indexed: 11/15/2022]
Abstract
BACKGROUND Behavioral and psychological signs and symptoms of dementia (BPSD) belong to the core symptoms of dementia and are also common in mild cognitive impairment (MCI). OBJECTIVE This study aims to contribute to the understanding of the prognostic role of BPSD in MCI for the progression to dementia due to Alzheimer's disease (AD). METHODS Data were generated through an ongoing prospective longitudinal study on BPSD. Assessment was performed by means of the Middelheim Frontality Score, Behave-AD, Cohen-Mansfield Agitation Inventory, Cornell Scale for Depression in Dementia (CSDD), and Geriatric Depression Scale 30-questions (GDS-30). Cox proportional hazard models were used to test the hypothesis that certain BPSD in MCI are predictors of developing AD. RESULTS The study population consisted of 183 MCI patients at baseline. At follow-up, 74 patients were stable and 109 patients had progressed to AD. The presence of significant depressive symptoms in MCI, as measured by the CSDD (HR: 2.06; 95% CI: 1.23-3.44; p = 0.011) and the GDS-30 (HR: 1.77; 95% CI: 1.10-2.85; p = 0.025), was associated with progression to AD. The severity of depressive symptoms as measured by the GDS-30 was also a predictor of progression (HR: 1.06; 95% CI: 1.01-1.11; p = 0.020). Furthermore, the severity of agitated behavior, especially verbal agitation and the presence of purposeless activity, was also associated with progression, whereas diurnal rhythm disturbances were associated with no progression to AD. CONCLUSION Depressive symptoms in MCI appear to be predictors for progression to AD.
Affiliation(s)
- Stefan Van der Mussele
- Reference Center for Biological Markers of Dementia (BIODEM), Laboratory of Neurochemistry and Behavior, Institute Born-Bunge, University of Antwerp, Antwerp, Belgium; Department of Nursing and Midwifery Sciences, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
- Erik Fransen
- StatUa Center for Statistics, University of Antwerp, Antwerp, Belgium
- Hanne Struyfs
- Reference Center for Biological Markers of Dementia (BIODEM), Laboratory of Neurochemistry and Behavior, Institute Born-Bunge, University of Antwerp, Antwerp, Belgium
- Jill Luyckx
- Reference Center for Biological Markers of Dementia (BIODEM), Laboratory of Neurochemistry and Behavior, Institute Born-Bunge, University of Antwerp, Antwerp, Belgium
- Peter Mariën
- Department of Neurology and Memory Clinic, Hospital Network Antwerp (ZNA), Middelheim and Hoge Beuken, Antwerp, Belgium; Department of Clinical and Experimental Neurolinguistics (CLIN), Vrije Universiteit Brussel, Brussels, Belgium
- Jos Saerens
- Department of Neurology and Memory Clinic, Hospital Network Antwerp (ZNA), Middelheim and Hoge Beuken, Antwerp, Belgium
- Nore Somers
- Department of Neurology and Memory Clinic, Hospital Network Antwerp (ZNA), Middelheim and Hoge Beuken, Antwerp, Belgium
- Johan Goeman
- Department of Neurology and Memory Clinic, Hospital Network Antwerp (ZNA), Middelheim and Hoge Beuken, Antwerp, Belgium
- Peter P De Deyn
- Reference Center for Biological Markers of Dementia (BIODEM), Laboratory of Neurochemistry and Behavior, Institute Born-Bunge, University of Antwerp, Antwerp, Belgium; Department of Neurology and Memory Clinic, Hospital Network Antwerp (ZNA), Middelheim and Hoge Beuken, Antwerp, Belgium; Department of Rehabilitation Sciences and Physiotherapy, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium; Department of Neurology and Alzheimer Research Center, University Medical Center Groningen, University of Groningen, The Netherlands
- Sebastiaan Engelborghs
- Reference Center for Biological Markers of Dementia (BIODEM), Laboratory of Neurochemistry and Behavior, Institute Born-Bunge, University of Antwerp, Antwerp, Belgium; Department of Neurology and Memory Clinic, Hospital Network Antwerp (ZNA), Middelheim and Hoge Beuken, Antwerp, Belgium
|
14
|
Okyere GA, Alalbil PA, Ping-Naah H, Tifere Y. Determinants of Survival in Adult HIV Clients on Antiretroviral Therapy in Lawra and Jirapa Districts of Upper West Region, Ghana. J Int Assoc Provid AIDS Care 2013; 14:255-60. [PMID: 24344253 DOI: 10.1177/2325957413500531] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/17/2022] Open
Abstract
We describe the rate of death and identify the determinants of survival in a cohort of adults starting antiretroviral therapy (ART) in 2 hospitals in Upper West Region, Ghana. The Kaplan-Meier method was used to estimate the survival probability after ART initiation, and a Cox proportional hazards model was used to assess the relationship between baseline variables and mortality. A total of 91 clients who were initiated on ART in both hospitals participated in the study. Clients in World Health Organization (WHO) clinical stage III/IV had a higher risk of mortality than those in stage I/II (hazard ratio [HR] of 3.93). Baseline hemoglobin, with a cutoff of ≥12 g/dL for women (and ≥13 g/dL for men), was strongly associated with mortality, with an HR of 3.87 (95% confidence interval [CI]: 0.71-21.19) for severe anemia, 2.11 (95% CI: 0.45-9.93) for moderate anemia, and 0.88 (95% CI: 0.16-4.82) for mild anemia. Anemia and WHO staging were independent predictors of mortality.
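For readers unfamiliar with the Kaplan-Meier estimate mentioned in this abstract, the following is a minimal sketch of the product-limit calculation in plain Python. It assumes right-censored data only; the function name and the toy follow-up times are hypothetical and are not taken from the study.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- observed follow-up times
    events -- 1/True if death was observed, 0/False if censored
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set after t.
        n_at_risk -= leaving
    return curve


# Hypothetical toy data: follow-up in months, two subjects censored.
toy_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 0, 1])
```

Censored subjects contribute to the risk set until their last follow-up but never trigger a drop in the curve, which is why the estimate differs from a simple proportion of deaths.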
Affiliation(s)
- Paul Awinbil Alalbil
- Department of Mathematics, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
- Henry Ping-Naah
- Department of Mathematics, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
- Yakubu Tifere
- Department of Mathematics, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
|
15
|
Abstract
In clinical trials, it is frequently of interest to estimate the time between the onset of two events (e.g. duration of response in oncology). Here, we consider the case where subjects are assessed at fixed visits but the initial event and the terminating event occur in between visits. This type of data, called doubly interval censored, is often analyzed with standard survival techniques, assuming either that the survival time (between initial and terminating event) is known exactly or is single interval censored. We introduce a motivating dataset in which the interest is to evaluate the impact of the treatment on the duration of response endpoint. We review the existing approaches and discuss their limitations with respect to the characteristics of our motivating dataset. Furthermore, we propose a stochastic EM algorithm that overcomes the problems in the existing approaches. We show by simulations the finite sample properties of our approach.
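To make the notion of doubly interval-censored data concrete, the sketch below computes the range of durations that is consistent with one subject's visit schedule. It only illustrates the data structure: it is not the stochastic EM algorithm proposed in the abstract, and the function name and the visit times are hypothetical.

```python
def duration_bounds(onset_interval, end_interval):
    """Bounds on the time between two events that are each interval censored.

    onset_interval -- (last visit before, first visit after) the initial event
    end_interval   -- (last visit before, first visit after) the terminating event
    Returns (shortest possible duration, longest possible duration).
    """
    onset_lo, onset_hi = onset_interval
    end_lo, end_hi = end_interval
    if onset_lo > onset_hi or end_lo > end_hi:
        raise ValueError("each interval must satisfy lower <= upper")
    if end_hi < onset_lo:
        raise ValueError("terminating event cannot precede the initial event")
    # Shortest duration: latest possible onset, earliest possible end.
    shortest = max(0.0, end_lo - onset_hi)
    # Longest duration: earliest possible onset, latest possible end.
    longest = end_hi - onset_lo
    return shortest, longest


# Hypothetical subject: response first seen between the week 8 and week 12
# visits, progression first seen between the week 20 and week 24 visits.
bounds = duration_bounds((8, 12), (20, 24))
```

Treating the duration as exactly known, or as censored in a single interval, discards the uncertainty in one of the two endpoints, which is the limitation of standard techniques that the abstract points to.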
Affiliation(s)
- David Dejardin
- Interuniversity Institute for Biostatistics and Statistical Bioinformatics, KULeuven and Universiteit Hasselt, Kapucijnenvoer 35, Blok D, Bus 7001, B3000 Leuven, Belgium
|