51
|
Zhang C, Li Z, Yang Z, Huang B, Hou Y, Chen Z. A Dynamic Prediction Model Supporting Individual Life Expectancy Prediction Based on Longitudinal Time-Dependent Covariates. IEEE J Biomed Health Inform 2023; 27:4623-4632. [PMID: 37471185 DOI: 10.1109/jbhi.2023.3292475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/22/2023]
Abstract
In clinical research on chronic diseases, common prediction outputs (such as survival rate) and effect sizes such as the hazard ratio (HR) are relative indicators and therefore convey fairly abstract information. Clinicians and patients, however, are more interested in simple and intuitive statements about (survival) time, such as how long a patient may live or how much longer a patient in a treatment group will live. In addition, long follow-up generates longitudinal time-dependent covariate information, and patients want to know how long they are expected to survive at each follow-up visit. In this study, based on a time-scale indicator, the restricted mean survival time (RMST), we proposed a dynamic RMST prediction model that considers longitudinal time-dependent covariates and uses joint modeling techniques. The model can describe the trajectory of longitudinal time-dependent covariates and predict the average survival times of patients at different time points (such as follow-up visits). Simulation studies with Monte Carlo cross-validation showed that the dynamic RMST prediction model was superior to the static RMST model. In addition, the dynamic RMST prediction model was applied to a primary biliary cirrhosis (PBC) population to dynamically predict the average survival times of the patients; the average C-index in internal validation reached 0.81, which was better than that of the static RMST regression. The proposed dynamic RMST prediction model therefore predicts better than the static alternative and can provide a scientific basis for clinical decisions by clinicians and patients.
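For readers unfamiliar with RMST, the quantity the abstract builds on is the area under the survival curve up to a horizon tau, i.e. the expected survival time restricted to [0, tau]. The following is a minimal sketch of the static, nonparametric version only (Kaplan-Meier plus step-function integration). It is not the authors' dynamic joint model, and the function names `kaplan_meier` and `rmst` are illustrative choices, not taken from the paper.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  : follow-up time per subject
    events : 1 if the event was observed, 0 if the subject was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            steps.append((t, surv))
        n_at_risk -= removed
    return steps


def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function from 0 to tau."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Last (possibly partial) rectangle up to the horizon tau.
    area += prev_s * (tau - prev_t)
    return area
```

With four subjects who all experience the event at times 1, 2, 3 and 4 and a horizon of tau = 4, the function returns approximately 2.5 time units. Unlike a hazard ratio, that number is on the time scale, which is the interpretability argument the abstract makes.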
52
|
Gusev A. Germline mechanisms of immunotherapy toxicities in the era of genome-wide association studies. Immunol Rev 2023; 318:138-156. [PMID: 37515388 PMCID: PMC11472697 DOI: 10.1111/imr.13253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 06/29/2023] [Indexed: 07/30/2023]
Abstract
Cancer immunotherapy has revolutionized the treatment of advanced cancers and is quickly becoming an option for early-stage disease. By reactivating the host immune system, immunotherapy harnesses patients' innate defenses to eradicate the tumor. By putatively similar mechanisms, immunotherapy can also substantially increase the risk of toxicities or immune-related adverse events (irAEs). Severe irAEs can lead to hospitalization, treatment discontinuation, lifelong immune complications, or even death. Many irAEs present with similar symptoms to heritable autoimmune diseases, suggesting that germline genetics may contribute to their onset. Recently, genome-wide association studies (GWAS) of irAEs have identified common germline associations and putative mechanisms, lending support to this hypothesis. A wide range of well-established GWAS methods can potentially be harnessed to understand the etiology of irAEs specifically and immunotherapy outcomes broadly. This review summarizes current findings regarding germline effects on immunotherapy outcomes and discusses opportunities and challenges for leveraging germline genetics to understand, predict, and treat irAEs.
Affiliation(s)
- Alexander Gusev
  - Division of Population Sciences, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts, USA
  - Division of Genetics, Brigham & Women's Hospital, Boston, Massachusetts, USA
  - The Broad Institute, Cambridge, Massachusetts, USA
53
|
Chen R, Cai N, Luo Z, Wang H, Liu X, Li J. Multi-task banded regression model: A novel individual survival analysis model for breast cancer. Comput Biol Med 2023; 162:107080. [PMID: 37271111 DOI: 10.1016/j.compbiomed.2023.107080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Revised: 04/11/2023] [Accepted: 05/27/2023] [Indexed: 06/06/2023]
Abstract
PURPOSE To reveal the hazard probability of individual breast cancer patients, a multi-task banded regression model is proposed for individual survival analysis of breast cancer. METHODS A banded verification matrix is designed to construct the response transform function of the proposed multi-task banded regression model, which can solve the repeated switching of survival rate. A martingale process is introduced to construct different nonlinear regressions for different survival subintervals. The concordance index (C-index) is used to compare the proposed model with Cox proportional hazards (CoxPH) models and previous multi-task regression models. RESULTS Two commonly used breast cancer datasets are employed to validate the proposed model. Specifically, the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) dataset includes 1981 breast cancer patients, of whom 57.7% died of breast cancer. The Rotterdam & German Breast Cancer Study Group (GBSG) dataset includes 1546 patients with lymph node-positive breast cancer in a randomized clinical trial, of whom 44.4% died. Experimental results indicate that the proposed model is superior to some existing models for overall and individual survival analysis of breast cancer, with a C-index of 0.6786 for the GBSG and 0.6701 for the METABRIC. CONCLUSION The superiority of the proposed model can be attributed to three novel ideas. First, a banded verification matrix can band the response of the survival process. Second, the martingale process can construct different nonlinear regressions for different survival subintervals. Third, the novel loss adapts the model so that the multi-task regression resembles the real survival process.
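Several entries in this list report a C-index, so a short illustration of how that number is obtained may help. The sketch below implements Harrell's concordance index for right-censored data in plain Python. It is a generic textbook formulation, not the evaluation code of any cited paper, and the function name and argument layout are assumptions made for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when subject i had an observed event
    and a strictly shorter time than subject j. The pair is concordant
    when the model gave subject i the higher risk score. Tied scores
    count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported values of roughly 0.67 to 0.68 in context. The double loop is quadratic in the number of subjects; that is acceptable for a sketch but slow on large cohorts.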
Affiliation(s)
- Rui Chen
  - School of Information Engineering, Guangdong University of Technology, Guangzhou, China
- Nian Cai
  - School of Information Engineering, Guangdong University of Technology, Guangzhou, China
- Zhihao Luo
  - School of Information Engineering, Guangdong University of Technology, Guangzhou, China
- Huiheng Wang
  - School of Information Engineering, Guangdong University of Technology, Guangzhou, China
- Xuan Liu
  - Department of Ultrasound, Sun Yat-Sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, China
- Jian Li
  - Department of Ultrasound, Sun Yat-Sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, China
54
|
Eskofier BM, Klucken J. Predictive Models for Health Deterioration: Understanding Disease Pathways for Personalized Medicine. Annu Rev Biomed Eng 2023; 25:131-156. [PMID: 36854259 DOI: 10.1146/annurev-bioeng-110220-030247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2023]
Abstract
Artificial intelligence (AI) and machine learning (ML) methods are currently widely employed in medicine and healthcare. A PubMed search returns more than 100,000 articles on these topics published between 2018 and 2022 alone. Notwithstanding several recent reviews in various subfields of AI and ML in medicine, we have yet to see a comprehensive review around the methods' use in longitudinal analysis and prediction of an individual patient's health status within a personalized disease pathway. This review seeks to fill that gap. After an overview of the AI and ML methods employed in this field and of specific medical applications of models of this type, the review discusses the strengths and limitations of current studies and looks ahead to future strands of research in this field. We aim to enable interested readers to gain a detailed impression of the research currently available and accordingly plan future work around predictive models for deterioration in health status.
Affiliation(s)
- Bjoern M Eskofier
  - Machine Learning and Data Analytics Lab, Department of Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Jochen Klucken
  - Digital Medicine Group, Luxembourg Centre for Systems Biomedicine, Université du Luxembourg, Belvaux, Luxembourg
  - Digital Medicine Group, Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg
  - Centre Hospitalier de Luxembourg, Luxembourg City, Luxembourg
55
|
Salerno S, Li Y. High-Dimensional Survival Analysis: Methods and Applications. Annu Rev Stat Appl 2023; 10:25-49. [PMID: 36968638 PMCID: PMC10038209 DOI: 10.1146/annurev-statistics-032921-022127] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/18/2023]
Abstract
In the era of precision medicine, time-to-event outcomes such as time to death or progression are routinely collected, along with high-throughput covariates. These high-dimensional data defy classical survival regression models, which are either infeasible to fit or likely to incur low predictability due to over-fitting. To overcome this, recent emphasis has been placed on developing novel approaches for feature selection and survival prognostication. We will review various cutting-edge methods that handle survival outcome data with high-dimensional predictors, highlighting recent innovations in machine learning approaches for survival prediction. We will cover the statistical intuitions and principles behind these methods and conclude with extensions to more complex settings, where competing events are observed. We exemplify these methods with applications to the Boston Lung Cancer Survival Cohort study, one of the largest cancer epidemiology cohorts investigating the complex mechanisms of lung cancer.
Affiliation(s)
- Stephen Salerno
  - Department of Biostatistics, University of Michigan, Ann Arbor, United States, 48109
- Yi Li
  - Department of Biostatistics, University of Michigan, Ann Arbor, United States, 48109
56
|
Deep-learning-based prognostic modeling for incident heart failure in patients with diabetes using electronic health records: A retrospective cohort study. PLoS One 2023; 18:e0281878. [PMID: 36809251 PMCID: PMC9943005 DOI: 10.1371/journal.pone.0281878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Accepted: 02/02/2023] [Indexed: 02/23/2023] Open
Abstract
Patients with type 2 diabetes mellitus (T2DM) have more than twice the risk of developing heart failure (HF) compared to patients without diabetes. The present study aimed to build an artificial intelligence (AI) prognostic model that takes into account a large and heterogeneous set of clinical factors and investigates the risk of developing HF in diabetic patients. We carried out an electronic health record (EHR)-based retrospective cohort study that included patients with cardiological clinical evaluation and no previous diagnosis of HF. The data consist of features extracted from clinical and administrative data obtained as part of routine medical care. The primary endpoint was diagnosis of HF (during out-of-hospital clinical examination or hospitalization). We developed two prognostic models using (1) elastic net regularization for the Cox proportional hazards model (COX) and (2) a deep neural network survival method (PHNN), in which a neural network was used to represent a non-linear hazard function and explainability strategies were applied to estimate the influence of predictors on the risk function. Over a median follow-up of 65 months, 17.3% of the 10,614 patients developed HF. The PHNN model outperformed COX both in terms of discrimination (c-index 0.768 vs 0.734) and calibration (2-year integrated calibration index 0.008 vs 0.018). The AI approach led to the identification of 20 predictors from different domains (age, body mass index, echocardiographic and electrocardiographic features, laboratory measurements, comorbidities, therapies) whose relationship with the predicted risk corresponds to known trends in clinical practice. Our results suggest that prognostic models for HF in diabetic patients may be improved by using EHRs in combination with AI techniques for survival analysis, which provide high flexibility and better performance than standard approaches.
57
|
Marthin P, Tutkun NA. Recurrent neural network for complex survival problems. J Stat Comput Simul 2023. [DOI: 10.1080/00949655.2023.2176504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]
Affiliation(s)
- Pius Marthin
  - Department of Statistics, Graduate School of Science and Engineering, Hacettepe University, Ankara, Turkey
- N. Ata Tutkun
  - Department of Statistics, Graduate School of Science and Engineering, Hacettepe University, Ankara, Turkey
58
|
Oflaz Z, Yozgatligil C, Selcuk-Kestel AS. Modeling comorbidity of chronic diseases using coupled hidden Markov model with bivariate discrete copula. Stat Methods Med Res 2023; 32:829-849. [PMID: 36775994 DOI: 10.1177/09622802231155100] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 02/14/2023]
Abstract
A range of chronic diseases have a significant influence on each other and share common risk factors. Comorbidity, which shows the existence of two or more diseases interacting or triggering each other, is an important measure for actuarial valuations. The main proposal of the study is to model parallel interacting processes describing two or more chronic diseases by a combination of hidden Markov theory and copula function. This study introduces a coupled hidden Markov model with the bivariate discrete copula function in the hidden process. To estimate the parameters of the model and deal with the numerical intractability of the log-likelihood, we use a variational expectation maximization algorithm. To perform the variational expectation maximization algorithm, a lower bound of the model's log-likelihood is defined, and estimators of the parameters are computed in the M-part. A possible numerical underflow occurring in the computation of forward-backward probabilities is solved. The simulation study is conducted for two different levels of association to assess the performance of the proposed model, resulting in satisfactory findings. The proposed model was applied to hospital appointment data from a private hospital. The model defines the dependency structure of unobserved disease data and its dynamics. The application results demonstrate that the model is useful for investigating disease comorbidity when only population dynamics over time and no clinical data are available.
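The abstract mentions numerical underflow in the forward-backward recursions without saying how it was handled. One standard remedy is to carry out the recursion in log space with the log-sum-exp trick. The sketch below shows this for an ordinary single-chain discrete HMM. It is an assumption-laden illustration, not the authors' coupled, copula-based model or their variational EM procedure, and the names `logsumexp` and `log_forward` are hypothetical.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_forward(log_init, log_trans, log_emit, observations):
    """Log-likelihood of an observation sequence under a discrete HMM.

    log_init[s]     : log P(first state = s)
    log_trans[r][s] : log P(next state = s | current state = r)
    log_emit[s][o]  : log P(observation = o | state = s)
    """
    n_states = len(log_init)
    # Initialisation: alpha_1(s) = pi(s) * b_s(o_1), kept in log space.
    alpha = [log_init[s] + log_emit[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        # Recursion: alpha_t(s) = [sum_r alpha_{t-1}(r) * a_{rs}] * b_s(o_t).
        alpha = [
            logsumexp([alpha[r] + log_trans[r][s] for r in range(n_states)])
            + log_emit[s][obs]
            for s in range(n_states)
        ]
    return logsumexp(alpha)
```

Because only log-probabilities are stored, the recursion stays finite for sequences whose raw likelihood would be far below the smallest positive double. Scaling the forward variables at each step is a common alternative with the same purpose.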
Affiliation(s)
- Zarina Oflaz
  - Department of Industrial Engineering, KTO Karatay University, Konya, Turkey
- Ceylan Yozgatligil
  - Department of Statistics, Middle East Technical University, Ankara, Turkey
59
|
Guo C, Ye Y, Yuan Y, Bao J, Mao G, Chen H. Reply to Comment on: Development and Validation of a Novel Nomogram for Predicting the Occurrence of Myopia in Schoolchildren: A Prospective Cohort Study. Am J Ophthalmol 2023; 246:275-276. [PMID: 36306829 DOI: 10.1016/j.ajo.2022.10.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Accepted: 10/06/2022] [Indexed: 01/24/2023]
Affiliation(s)
- Chengnan Guo
  - Division of Epidemiology and Health Statistics, Department of Preventive Medicine, School of Public Health & Management, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Yingying Ye
  - Eye Hospital, School of Ophthalmology and Optometry, Wenzhou Medical University, Wenzhou, Zhejiang, China
  - National Clinical Research Center for Ocular Diseases, Wenzhou, Zhejiang, China
  - WEIRC, Wenzhou Medical University-Essilor International Research Centre, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Yimin Yuan
  - Eye Hospital, School of Ophthalmology and Optometry, Wenzhou Medical University, Wenzhou, Zhejiang, China
  - National Clinical Research Center for Ocular Diseases, Wenzhou, Zhejiang, China
  - WEIRC, Wenzhou Medical University-Essilor International Research Centre, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Jinhua Bao
  - Eye Hospital, School of Ophthalmology and Optometry, Wenzhou Medical University, Wenzhou, Zhejiang, China
  - National Clinical Research Center for Ocular Diseases, Wenzhou, Zhejiang, China
  - WEIRC, Wenzhou Medical University-Essilor International Research Centre, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Guangyun Mao
  - Division of Epidemiology and Health Statistics, Department of Preventive Medicine, School of Public Health & Management, Wenzhou Medical University, Wenzhou, Zhejiang, China
  - Eye Hospital, School of Ophthalmology and Optometry, Wenzhou Medical University, Wenzhou, Zhejiang, China
  - National Clinical Research Center for Ocular Diseases, Wenzhou, Zhejiang, China
- Hao Chen
  - Eye Hospital, School of Ophthalmology and Optometry, Wenzhou Medical University, Wenzhou, Zhejiang, China
  - National Clinical Research Center for Ocular Diseases, Wenzhou, Zhejiang, China
60
|
Nguyen HT, Vasconcellos HD, Keck K, Reis JP, Lewis CE, Sidney S, Lloyd-Jones DM, Schreiner PJ, Guallar E, Wu CO, Lima JA, Ambale-Venkatesh B. Multivariate longitudinal data for survival analysis of cardiovascular event prediction in young adults: insights from a comparative explainable study. BMC Med Res Methodol 2023; 23:23. [PMID: 36698064 PMCID: PMC9878947 DOI: 10.1186/s12874-023-01845-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/19/2022] [Accepted: 01/18/2023] [Indexed: 01/26/2023] Open
Abstract
BACKGROUND Multivariate longitudinal data are under-utilized for survival analysis compared to cross-sectional data (CS - data collected once across cohort). Particularly in cardiovascular risk prediction, despite available methods of longitudinal data analysis, the value of longitudinal information has not been established in terms of improved predictive accuracy and clinical applicability. METHODS We investigated the value of longitudinal data over and above the use of cross-sectional data via 6 distinct modeling strategies from statistics, machine learning, and deep learning that incorporate repeated measures for survival analysis of the time-to-cardiovascular event in the Coronary Artery Risk Development in Young Adults (CARDIA) cohort. We then examined and compared the use of model-specific interpretability methods (Random Survival Forest Variable Importance) and model-agnostic methods (SHapley Additive exPlanation (SHAP) and Temporal Importance Model Explanation (TIME)) in cardiovascular risk prediction using the top-performing models. RESULTS In a cohort of 3539 participants, longitudinal information from 35 variables that were repeatedly collected in 6 exam visits over 15 years improved subsequent long-term (17 years after) risk prediction by up to 8.3% in C-index compared to using baseline data (0.78 vs. 0.72), and up to approximately 4% compared to using the last observed CS data (0.75). Time-varying AUC was also higher in models using longitudinal data (0.86-0.87 at 5 years, 0.79-0.81 at 10 years) than using baseline or last observed CS data (0.80-0.86 at 5 years, 0.73-0.77 at 10 years). Comparative model interpretability analysis revealed the impact of longitudinal variables on model prediction on both the individual and global scales among different modeling strategies, as well as identifying the best time windows and best timing within that window for event prediction. 
The best strategy to incorporate longitudinal data for accuracy was time series massive feature extraction, and the easiest interpretable strategy was trajectory clustering. CONCLUSION Our analysis demonstrates the added value of longitudinal data in predictive accuracy and epidemiological utility in cardiovascular risk survival analysis in young adults via a unified, scalable framework that compares model performance and explainability. The framework can be extended to a larger number of variables and other longitudinal modeling methods. TRIAL REGISTRATION ClinicalTrials.gov Identifier: NCT00005130, Registration Date: 26/05/2000.
Affiliation(s)
- Hieu T. Nguyen
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Henrique D. Vasconcellos
  - Department of Cardiology, Johns Hopkins University, Baltimore, MD, USA
- Kimberley Keck
  - Department of Cardiology, Johns Hopkins University, Baltimore, MD, USA
- Jared P. Reis
  - National Heart, Lung, and Blood Institute, Bethesda, MD, USA
- Cora E. Lewis
  - Department of Epidemiology, School of Public Health, University of Alabama at Birmingham, Birmingham, AL, USA
- Steven Sidney
  - Division of Research, Kaiser Permanente, Oakland, CA, USA
- Donald M. Lloyd-Jones
  - Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
- Pamela J. Schreiner
  - School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Eliseo Guallar
  - Department of Epidemiology, Johns Hopkins University School of Public Health, Baltimore, MD, USA
- Colin O. Wu
  - National Heart, Lung, and Blood Institute, Bethesda, MD, USA
- João A.C. Lima
  - Department of Cardiology, Johns Hopkins University, Baltimore, MD, USA
  - Department of Radiology, Johns Hopkins University, Baltimore, MD, USA
- Bharath Ambale-Venkatesh
  - Department of Radiology, Johns Hopkins University, Baltimore, MD, USA
61
|
Li Y, Liang D, Ma S, Ma C. Spatio-temporally smoothed deep survival neural network. J Biomed Inform 2023; 137:104255. [PMID: 36462600 PMCID: PMC9845179 DOI: 10.1016/j.jbi.2022.104255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2022] [Revised: 11/16/2022] [Accepted: 11/24/2022] [Indexed: 12/03/2022]
Abstract
The analysis of registry data has important implications for cancer monitoring, control, and treatment. In such analysis, (semi)parametric models, such as the Cox Proportional Hazards model, have been routinely adopted. In recent years, deep neural network (DNN) has been shown to excel in many fields with its flexibility and superior prediction performance, and it has been applied to the analysis of cancer survival data. Cancer registry data usually has a broad spatial and temporal coverage, leading to significant heterogeneity. Published studies have suggested that it is not sensible to fit one model for all spatial and temporal locations combined. On the other hand, it is inefficient to fit one model for each spatial/temporal location separately. Motivated by such considerations, in this study, we develop a spatio-temporally smoothed DNN approach for the analysis of cancer registry data with a (censored) survival outcome. This approach can accommodate the significant differences across time and space, while recognizing that the spatial and temporal changes are smooth. It is effectively realized via cutting-edge optimization techniques. To draw more definitive conclusions, we also develop an approach for assessing the importance of each individual input variable. Data on head and neck cancer (HNC) and pancreatic cancer from the Surveillance, Epidemiology, and End Results (SEER) database is analyzed. Compared to direct competitors, the proposed approach leads to network architectures that are smoother. Evaluated using the time-dependent Concordance-Index, it has a better prediction performance. The important variables are also biomedically sensible. Overall, this study can deliver a new and effective tool for deciphering cancer survival at the population level.
Affiliation(s)
- Yang Li
  - Center for Applied Statistics and School of Statistics, Renmin University of China, Beijing, China
- Dongzuo Liang
  - Center for Applied Statistics and School of Statistics, Renmin University of China, Beijing, China
- Shuangge Ma
  - Department of Biostatistics, Yale University, New Haven, CT, USA
- Chenjin Ma
  - Department of Statistics and Data Science, Beijing University of Technology, Beijing, China
62
|
TERTIAN: Clinical Endpoint Prediction in ICU via Time-Aware Transformer-Based Hierarchical Attention Network. Comput Intell Neurosci 2022; 2022:4207940. [PMID: 36567811 PMCID: PMC9788893 DOI: 10.1155/2022/4207940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2022] [Revised: 11/19/2022] [Accepted: 11/22/2022] [Indexed: 12/23/2022]
Abstract
Accurately predicting the clinical endpoint in ICU based on the patient's electronic medical records (EMRs) is essential for the timely treatment of critically ill patients and allocation of medical resources. However, the patient's EMRs usually consist of a large amount of heterogeneous multivariate time series data such as laboratory tests and vital signs, which are produced irregularly. Most existing methods fail to effectively model the time irregularity inherent in longitudinal patient medical records and capture the interrelationships among different types of data. To tackle these limitations, we propose a novel time-aware transformer-based hierarchical attention network (TERTIAN) for clinical endpoint prediction. In this model, a time-aware transformer is introduced to learn the personalized irregular temporal patterns of medical events, and a hierarchical attention mechanism is deployed to get the accurate patient fusion representation by comprehensively mining the interactions and correlations among multiple types of medical data. We evaluate our model on the MIMIC-III dataset and MIMIC-IV dataset for the task of mortality prediction, and the results show that TERTIAN achieves higher performance than state-of-the-art approaches.
63
|
Hong C, Chen J, Yi F, Hao Y, Meng F, Dong Z, Lin H, Huang Z. CD-Surv: a contrastive-based model for dynamic survival analysis. Health Inf Sci Syst 2022; 10:5. [PMID: 35494891 PMCID: PMC9005562 DOI: 10.1007/s13755-022-00173-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/19/2021] [Accepted: 03/30/2022] [Indexed: 11/26/2022] Open
Abstract
Survival analysis, aimed at investigating the relationships between covariates and event time, has exhibited profound effects on health service management. Longitudinal data with sequential patterns, such as electronic health records (EHRs), contain a large volume of patient treatment trajectories, and therefore, provide great potential for survival analysis. However, most existing studies address the survival analysis problem in a static manner, that is, they only utilize a fraction of longitudinal data, ignore the correlations between multiple visits, and usually may not be able to capture the latent representations of patient treatment trajectories. This inevitably deteriorates the performance of the survival analysis. To address this challenge, we propose an end-to-end contrastive-based model CD-Surv to better understand the patient treatment trajectories and dynamically predict the survival probability of a target patient. Specifically, two data augmentation strategies, namely, mask generation and shuffle generation, are adopted to augment the real treatment trajectories documented in the EHR. Based on this, the hidden representations of the real trajectories can be improved by utilizing contrastive learning between augmented and real trajectories. We evaluated our proposed CD-Surv on two real-world datasets, and the experimental results indicated that our proposed model could outperform state-of-the-art baselines on various evaluation metrics.
Affiliation(s)
- Caogen Hong
  - Zhejiang University, Hangzhou, Zhejiang, China
  - Jiangsu Automation Research Institute, Lianyungang, China
- Fan Yi
  - Zhejiang University, Hangzhou, Zhejiang, China
- Yuzhe Hao
  - Jiangsu Automation Research Institute, Lianyungang, China
- Fanwen Meng
  - Jiangsu Automation Research Institute, Lianyungang, China
- Hui Lin
  - Zhejiang University, Hangzhou, Zhejiang, China
64
|
Chu J, Zhang Y, Huang F, Si L, Huang S, Huang Z. Disentangled representation for sequential treatment effect estimation. Comput Methods Programs Biomed 2022; 226:107175. [PMID: 36242866 DOI: 10.1016/j.cmpb.2022.107175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Revised: 10/04/2022] [Accepted: 10/04/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND AND OBJECTIVE Treatment effect estimation, a fundamental problem in causal inference, focuses on estimating the outcome difference between different treatments. However, in clinical observational data, some patient covariates (such as gender and age) affect not only the outcomes but also the treatment assignment. Such covariates, known as confounders, produce distribution discrepancies between treatment groups, thereby introducing selection bias into the estimation of treatment effects. The situation is even more complicated in longitudinal data, because the confounders are time-varying: they depend on patient history and in turn affect future outcomes and treatment assignments. Existing methods mainly work on cross-sectional data obtained at a specific time point and cannot process the time-varying confounders hidden in longitudinal data. METHODS In this study, we address this problem for the first time by disentangled representation learning, which treats the observational data as consisting of three components: outcome-specific factors, treatment-specific factors, and time-varying confounders. Based on this, the proposed approach adopts a recurrent neural network-based framework to process sequential information and learn the disentangled representations of the components from longitudinal observational sequences, and captures the posterior distributions of latent factors with a multi-task learning strategy. Moreover, mutual information-based regularization is adopted to eliminate the time-varying confounders. In this way, the association between patient history and treatment assignment is removed and the estimation can be conducted effectively. RESULTS We evaluate our model in a realistic set-up using a model of tumor growth. The proposed model achieves the best performance over benchmark models in most cases for both one-step-ahead prediction (0.70% vs 0.74% for the state-of-the-art model when γ = 3, measured by normalized root mean square error; lower is better) and five-step-ahead prediction (1.47% vs 1.83%). As the effect of confounders increases, our proposed model consistently outperforms the state-of-the-art model. In addition, we adopted t-SNE to visualize the disentangled representations and present the effectiveness of disentanglement explicitly and intuitively. CONCLUSIONS The experimental results indicate the powerful capacity of our model in learning disentangled representations from longitudinal observational data and dealing with time-varying confounders, and demonstrate the superior performance of our proposed model on dynamic treatment effect estimation.
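The results above are reported as normalized root mean square error. The abstract does not say which normalization was used, so the following is only a minimal sketch that assumes normalization by the range of the observed values; the function name `nrmse` and the toy numbers are illustrative and do not come from the paper.

```python
import math


def nrmse(y_true, y_pred):
    """RMSE divided by the range of the observed values (an assumed convention)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    value_range = max(y_true) - min(y_true)
    if value_range == 0:
        raise ValueError("observed values have zero range")
    return math.sqrt(mse) / value_range


# Toy data, not taken from the study.
observed = [0.0, 5.0, 10.0]
predicted = [1.0, 5.0, 9.0]
print(f"NRMSE = {100 * nrmse(observed, predicted):.2f}%")
```

Other conventions (dividing by the mean or by a fixed maximum value) change the scale of the result, so percentages are only comparable between models scored with the same normalization.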
Affiliation(s)
- Jiebin Chu
- Zhejiang University, Hangzhou, Zhejiang Province, China
| | - Yaoyun Zhang
- Alibaba Group, Hangzhou, Zhejiang Province, China
| | - Fei Huang
- Alibaba Group, Hangzhou, Zhejiang Province, China
| | - Luo Si
- Alibaba Group, Hangzhou, Zhejiang Province, China
| | | | | |
|
65
|
P D, C G. A systematic review on machine learning and deep learning techniques in cancer survival prediction. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2022; 174:62-71. [PMID: 35933043 DOI: 10.1016/j.pbiomolbio.2022.07.004] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/07/2022] [Revised: 07/13/2022] [Accepted: 07/19/2022] [Indexed: 06/15/2023]
Abstract
Cancer is a disease characterised by the abnormal and uncontrolled growth of body cells. This growth is often asymptomatic and can spread to other parts of the body. A major problem in treating cancer is that its progress is not monitored once it is diagnosed. The progress, or prognosis, can be assessed through survival analysis. Survival analysis is the branch of statistics that deals with predicting the time until an event occurs. In cancer prognosis, the quantity of interest is the survival time of the patient from the onset of the disease, or the time to recurrence of the disease after treatment. This study aims to survey the machine learning and deep learning models used to provide prognoses for cancer patients.
Affiliation(s)
- Deepa P
- School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, India
| | - Gunavathi C
- School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India.
| |
|
66
|
Abstract
Neonatal care is becoming increasingly complex with large amounts of rich, routinely recorded physiological, diagnostic and outcome data. Artificial intelligence (AI) has the potential to harness this vast quantity and range of information and become a powerful tool to support clinical decision making, personalised care, precise prognostics, and enhance patient safety. Current AI approaches in neonatal medicine include tools for disease prediction and risk stratification, neurological diagnostic support and novel image recognition technologies. Key to the integration of AI in neonatal medicine is the understanding of its limitations and a standardised critical appraisal of AI tools. Barriers and challenges to this include the quality of datasets used, performance assessment, and appropriate external validation and clinical impact studies. Improving digital literacy amongst healthcare professionals and cross-disciplinary collaborations are needed to harness the full potential of AI to help take the next significant steps in improving neonatal outcomes for high-risk infants.
|
67
|
Guo C, Ye Y, Yuan Y, Wong YL, Li X, Huang Y, Bao J, Mao G, Chen H. Development and validation of a novel nomogram for predicting the occurrence of myopia in schoolchildren: A prospective cohort study. Am J Ophthalmol 2022; 242:96-106. [PMID: 35750213 DOI: 10.1016/j.ajo.2022.05.027] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2021] [Revised: 03/06/2022] [Accepted: 05/31/2022] [Indexed: 01/31/2023]
Abstract
PURPOSE Myopia is a major public health issue and occurs at young ages. Apart from its high prevalence, myopia results in high costs and irreversible blinding diseases. Accurate prediction of the risk of myopia onset is crucial for its precise prevention. We aimed to develop and validate an effective nomogram for predicting myopia onset in schoolchildren. DESIGN School-based prospective cohort study. METHODS A total of 1073 schoolchildren were enrolled from November 2014 to May 2019 in China, and were divided into training and validation cohorts. Myopia was defined as a spherical equivalent refraction (SER) ≤-0.5 diopters. Predictors of myopia were determined through least absolute shrinkage and selection operator regression and a multivariable Cox proportional hazards model based on the training cohort. The predictive performance of the nomogram was validated internally through time-dependent receiver operating characteristic (ROC) curves, calibration plots, decision curve analysis, and Kaplan-Meier curves. RESULTS Independent predictors at baseline, including gender, SER, axial length, corneal refractive power, and positive relative accommodation, were included in the nomogram prediction model. This nomogram demonstrated excellent calibration, clinical net benefit, and discrimination, with all areas under the ROC curve (AUCs) between 0.74 and 0.86 in the training and validation cohorts. The Kaplan-Meier curves showed that 3 distinct risk groups stratified through X-tile analysis were well discriminated and robust among subgroups. Harrell's C-index and the net reclassification improvement demonstrated that the nomogram substantially improved on previous models. An online myopia risk calculator was generated for better individual prediction. CONCLUSIONS The nomogram provides accurate and individual prediction of myopia onset in schoolchildren. External validation is needed to verify the generalizability of this nomogram.
Affiliation(s)
- Chengnan Guo
- Division of Epidemiology and Health Statistics, Department of Preventive Medicine, School of Public Health & Management, Wenzhou Medical University, Wenzhou, Zhejiang, China.
| | - Yingying Ye
- Eye Hospital, School of Ophthalmology and Optometry, Wenzhou Medical University, Wenzhou, Zhejiang, China; National Clinical Research Center for Ocular Diseases, Wenzhou, Zhejiang, China; WEIRC, Wenzhou Medical University-Essilor International Research Centre, Wenzhou Medical University, Wenzhou, Zhejiang, China.
| | - Yimin Yuan
- Eye Hospital, School of Ophthalmology and Optometry, Wenzhou Medical University, Wenzhou, Zhejiang, China; National Clinical Research Center for Ocular Diseases, Wenzhou, Zhejiang, China; WEIRC, Wenzhou Medical University-Essilor International Research Centre, Wenzhou Medical University, Wenzhou, Zhejiang, China
| | - Yee Ling Wong
- WEIRC, Wenzhou Medical University-Essilor International Research Centre, Wenzhou Medical University, Wenzhou, Zhejiang, China; R&D AMERA, Essilor International, Singapore
| | - Xue Li
- Eye Hospital, School of Ophthalmology and Optometry, Wenzhou Medical University, Wenzhou, Zhejiang, China; National Clinical Research Center for Ocular Diseases, Wenzhou, Zhejiang, China; WEIRC, Wenzhou Medical University-Essilor International Research Centre, Wenzhou Medical University, Wenzhou, Zhejiang, China
| | - Yingying Huang
- Eye Hospital, School of Ophthalmology and Optometry, Wenzhou Medical University, Wenzhou, Zhejiang, China; National Clinical Research Center for Ocular Diseases, Wenzhou, Zhejiang, China; WEIRC, Wenzhou Medical University-Essilor International Research Centre, Wenzhou Medical University, Wenzhou, Zhejiang, China
| | - Jinhua Bao
- Eye Hospital, School of Ophthalmology and Optometry, Wenzhou Medical University, Wenzhou, Zhejiang, China; National Clinical Research Center for Ocular Diseases, Wenzhou, Zhejiang, China; WEIRC, Wenzhou Medical University-Essilor International Research Centre, Wenzhou Medical University, Wenzhou, Zhejiang, China.
| | - Guangyun Mao
- Division of Epidemiology and Health Statistics, Department of Preventive Medicine, School of Public Health & Management, Wenzhou Medical University, Wenzhou, Zhejiang, China; Eye Hospital, School of Ophthalmology and Optometry, Wenzhou Medical University, Wenzhou, Zhejiang, China; National Clinical Research Center for Ocular Diseases, Wenzhou, Zhejiang, China
| | - Hao Chen
- Eye Hospital, School of Ophthalmology and Optometry, Wenzhou Medical University, Wenzhou, Zhejiang, China; National Clinical Research Center for Ocular Diseases, Wenzhou, Zhejiang, China
| |
|
68
|
Neural Networks for Survival Prediction in Medicine Using Prognostic Factors: A Review and Critical Appraisal. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:1176060. [PMID: 36238497 PMCID: PMC9553343 DOI: 10.1155/2022/1176060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/25/2021] [Revised: 08/26/2022] [Accepted: 09/13/2022] [Indexed: 11/17/2022]
Abstract
Survival analysis deals with the expected duration of time until one or more events of interest occur. Time to the event of interest may be unobserved, a phenomenon commonly known as right censoring, which renders the analysis of these data challenging. Over the years, machine learning algorithms have been developed and adapted to right-censored data. Neural networks have been repeatedly employed to build clinical prediction models in healthcare with a focus on cancer and cardiology. We present the first ever attempt at a large-scale review of survival neural networks (SNNs) with prognostic factors for clinical prediction in medicine. This work provides a comprehensive understanding of the literature (24 studies from 1990 to August 2021, global search in PubMed). Relevant manuscripts are classified as methodological/technical (novel methodology or new theoretical model; 13 studies) or applications (11 studies). We investigate how researchers have used neural networks to fit survival data for prediction. There are two methodological trends: either time is added as part of the input features and a single output node is specified, or multiple output nodes are defined, one for each time interval. A critical appraisal of model aspects that should be designed and reported more carefully is performed. We identify key characteristics of prediction models (i.e., number of patients/predictors, evaluation measures, calibration), and compare ANN's predictive performance to the Cox proportional hazards model. The median sample size is 920 patients, and the median number of predictors is 7. Major findings include poor reporting (e.g., regarding missing data, hyperparameters) as well as inaccurate model development/validation. Calibration is neglected in more than half of the studies. Cox models are not developed to their full potential and claims for the performance of SNNs are exaggerated. Light is shed on the current state of the art of SNNs in medicine with prognostic factors. Recommendations are made for the reporting of clinical prediction models. Limitations are discussed, and future directions are proposed for researchers who seek to develop existing methodology.
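The two methodological trends named in this abstract (time as an input feature with a single output node, versus one output node per time interval) can be illustrated with a small data-encoding sketch. This is an assumption-laden illustration, not the encoding of any specific reviewed study: the function names, the rule that a censored patient counts as at risk through the censoring interval, and the toy patient are all hypothetical.

```python
def time_as_input_rows(features, last_interval, event, n_intervals):
    """Single-output encoding: one training row per interval at risk.

    Each row is (features + [interval index], label). The label is 1 only
    in the interval where the event happened, 0 otherwise.
    """
    if not 0 <= last_interval < n_intervals:
        raise ValueError("last_interval must lie in [0, n_intervals)")
    rows = []
    for k in range(last_interval + 1):
        label = 1 if (event and k == last_interval) else 0
        rows.append((list(features) + [k], label))
    return rows


def multi_output_target(last_interval, event, n_intervals):
    """Multiple-output encoding: one target and one mask entry per interval.

    The mask marks intervals in which the patient was still under
    observation, so a loss can ignore intervals after event or censoring.
    """
    if not 0 <= last_interval < n_intervals:
        raise ValueError("last_interval must lie in [0, n_intervals)")
    target = [0] * n_intervals
    mask = [1 if k <= last_interval else 0 for k in range(n_intervals)]
    if event:
        target[last_interval] = 1
    return target, mask


# Hypothetical patient: two covariates, event observed in interval 2 of 4.
print(time_as_input_rows([63, 1], last_interval=2, event=True, n_intervals=4))
print(multi_output_target(last_interval=2, event=True, n_intervals=4))
```

The first encoding multiplies the number of training rows by the number of intervals at risk, while the second keeps one row per patient and widens the output layer instead.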
|
69
|
Mulder ST, Omidvari AH, Rueten-Budde AJ, Huang PH, Kim KH, Bais B, Rousian M, Hai R, Akgun C, van Lennep JR, Willemsen S, Rijnbeek PR, Tax DM, Reinders M, Boersma E, Rizopoulos D, Visch V, Steegers-Theunissen R. Dynamic Digital Twin: Diagnosis, Treatment, Prediction, and Prevention of Disease During the Life Course. J Med Internet Res 2022; 24:e35675. [PMID: 36103220 PMCID: PMC9520391 DOI: 10.2196/35675] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2021] [Revised: 05/31/2022] [Accepted: 06/09/2022] [Indexed: 11/13/2022] Open
Abstract
A digital twin (DT), originally defined as a virtual representation of a physical asset, system, or process, is a new concept in health care. A DT in health care is not a single technology but a domain-adapted multimodal modeling approach incorporating the acquisition, management, analysis, prediction, and interpretation of data, aiming to improve medical decision-making. However, there are many challenges and barriers that must be overcome before a DT can be used in health care. In this viewpoint paper, we build on the current literature, address these challenges, and describe a dynamic DT in health care for optimizing individual patient health care journeys, specifically for women at risk for cardiovascular complications in the preconception and pregnancy periods and across the life course. We describe how we can commit multiple domains to developing this DT. With our cross-domain definition of the DT, we aim to define future goals, trade-offs, and methods that will guide the development of the dynamic DT and implementation strategies in health care.
Affiliation(s)
- Skander Tahar Mulder
- Pattern Recognition Lab, Mathematics and Computer Science, Technical University Delft, Delft, Netherlands
| | - Amir-Houshang Omidvari
- Department of Cardiology, Erasmus Medical Center, Rotterdam, Netherlands
- Department of Public Health, Erasmus Medical Center, Rotterdam, Netherlands
| | | | - Pei-Hua Huang
- Department of Medical Ethics and Philosophy, Erasmus Medical Center, Rotterdam, Netherlands
| | - Ki-Hun Kim
- Department of Industrial Engineering, Pusan National University, Busan, Republic of Korea
| | - Babette Bais
- Obstetrics and Gynaecology, Erasmus Medical Center, Rotterdam, Netherlands
| | - Melek Rousian
- Obstetrics and Gynaecology, Erasmus Medical Center, Rotterdam, Netherlands
| | - Rihan Hai
- Web Information Systems Group, Mathematics and Computer Science, Technical University of Delft, Delft, Netherlands
| | - Can Akgun
- Web Information Systems Group, Mathematics and Computer Science, Technical University of Delft, Delft, Netherlands
- Bioelectronics Section, Department of Microelectronics, Faculty of Electrical Engineering, Technical University Delft, Delft, Netherlands
| | | | - Sten Willemsen
- Department of Biostatistics, Erasmus Medical Center, Rotterdam, Netherlands
| | - Peter R Rijnbeek
- Department of Medical Informatics, Erasmus Medical Center, Rotterdam, Netherlands
| | - David Mj Tax
- Pattern Recognition Lab, Mathematics and Computer Science, Technical University Delft, Delft, Netherlands
| | - Marcel Reinders
- Pattern Recognition Lab, Mathematics and Computer Science, Technical University Delft, Delft, Netherlands
| | - Eric Boersma
- Department of Cardiology, Erasmus Medical Center, Rotterdam, Netherlands
| | | | - Valentijn Visch
- Industrial Design, Mathematics and Computer Science, Technical University Delft, Delft, Netherlands
| | | |
|
70
|
Longato E, Di Camillo B, Sparacino G, Avogaro A, Fadini GP. Time-resolved trajectory of glucose lowering medications and cardiovascular outcomes in type 2 diabetes: a recurrent neural network analysis. Cardiovasc Diabetol 2022; 21:159. [PMID: 35996111 PMCID: PMC9396779 DOI: 10.1186/s12933-022-01600-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/01/2022] [Accepted: 08/09/2022] [Indexed: 11/25/2022] Open
Abstract
Aim Treatment algorithms define lines of glucose lowering medications (GLM) for the management of type 2 diabetes (T2D), but whether therapeutic trajectories are associated with major adverse cardiovascular events (MACE) is unclear. We explored whether the temporal resolution of GLM usage discriminates patients who experienced a 4P-MACE (heart failure, myocardial infarction, stroke, death from all causes). Methods We used an administrative database (Veneto region, North-East Italy, 2011–2018) and implemented recurrent neural networks (RNN) with outcome-specific attention maps. The model input included age, sex, diabetes duration, and a matrix of GLM pattern before the 4P-MACE or censoring. Model performance was assessed as discrimination, reported as area under the receiver operating characteristic curve (AUROC). Attention maps were produced to show medications whose time-resolved trajectories were the most important for discrimination. Results The analysis was conducted on 147,135 patients for training and model selection and on 10,000 patients for validation. Collected data spanned a period of ~6 years. The RNN model efficiently discriminated temporal patterns of GLM ending in a 4P-MACE vs. those ending in an event-free censoring with an AUROC of 0.911 (95% C.I. 0.904–0.919). This excellent performance was significantly better than that of other models not incorporating time-resolved GLM trajectories: (i) a logistic regression on the bag-of-words encoding all GLM ever taken by the patient (AUROC 0.754; 95% C.I. 0.743–0.765); (ii) a model including the sequence of GLM without temporal relationships (AUROC 0.749; 95% C.I. 0.737–0.761); (iii) an RNN model with the same construction rules but including a time-inverted or randomised order of GLM. Attention maps identified the time-resolved pattern of the most common first-line (metformin) and second-line (sulphonylureas) GLM, and insulin (glargine), as those determining discrimination capacity. Conclusions The time-resolved pattern of GLM use identified patients with subsequent cardiovascular events better than the mere list or sequence of prescribed GLM. Thus, a patient’s therapeutic trajectory could determine disease outcomes.
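The contrast drawn here between a bag-of-words encoding and a time-resolved medication matrix can be made concrete with a short sketch. The three-drug vocabulary, the per-period set representation, and both function names are assumptions for illustration only; the abstract does not describe the study's actual input layout at this level of detail.

```python
# Illustrative vocabulary; the study's real GLM list is not given in the abstract.
DRUGS = ["metformin", "sulphonylurea", "insulin"]


def bag_of_words(history):
    """One indicator per drug: was it ever taken? Order and timing are lost."""
    ever_taken = set().union(*history)
    return [1 if drug in ever_taken else 0 for drug in DRUGS]


def time_resolved_matrix(history):
    """One row per time period, one column per drug; timing is preserved."""
    return [[1 if drug in period else 0 for drug in DRUGS] for period in history]


# Two hypothetical patients who took the same drugs in reverse order.
patient_a = [{"metformin"}, {"metformin", "sulphonylurea"}, {"insulin"}]
patient_b = [{"insulin"}, {"metformin", "sulphonylurea"}, {"metformin"}]

print(bag_of_words(patient_a) == bag_of_words(patient_b))
print(time_resolved_matrix(patient_a) == time_resolved_matrix(patient_b))
```

The two toy patients are indistinguishable to a model fed the bag-of-words vector, whereas a sequence model fed the matrix can tell them apart, which is the kind of information the time-inverted and randomised-order control in the abstract is designed to probe.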
Affiliation(s)
- Enrico Longato
- Department of Information Engineering, University of Padova, 35100, Padua, Italy
| | - Barbara Di Camillo
- Department of Information Engineering, University of Padova, 35100, Padua, Italy.,Department of Comparative Biomedicine and Food Science, University of Padova, 35020, Legnaro, Italy
| | - Giovanni Sparacino
- Department of Information Engineering, University of Padova, 35100, Padua, Italy
| | - Angelo Avogaro
- Department of Medicine DIMED, University of Padova, Via Giustiniani 2, 35100, Padua, Italy
| | - Gian Paolo Fadini
- Department of Medicine DIMED, University of Padova, Via Giustiniani 2, 35100, Padua, Italy.
| |
|
71
|
Kim SI, Kang JW, Eun YG, Lee YC. Prediction of survival in oropharyngeal squamous cell carcinoma using machine learning algorithms: A study based on the surveillance, epidemiology, and end results database. Front Oncol 2022; 12:974678. [PMID: 36072804 PMCID: PMC9441569 DOI: 10.3389/fonc.2022.974678] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2022] [Accepted: 08/08/2022] [Indexed: 11/28/2022] Open
Abstract
Background We determined appropriate survival prediction machine learning models for patients with oropharyngeal squamous cell carcinoma (OPSCC) using the “Surveillance, Epidemiology, and End Results” (SEER) database. Methods In total, 4039 patients diagnosed with OPSCC between 2004 and 2016 were enrolled in this study. In particular, 13 variables were selected and analyzed: age, sex, tumor grade, tumor size, neck dissection, radiation therapy, cancer directed surgery, chemotherapy, T stage, N stage, M stage, clinical stage, and human papillomavirus (HPV) status. The T-, N-, and clinical staging were reconstructed based on the American Joint Committee on Cancer (AJCC) Staging Manual, 8th Edition. The patients were randomly assigned to a development or test dataset at a 7:3 ratio. The extremely randomized survival tree (EST), conditional survival forest (CSF), and DeepSurv models were used to predict the overall and disease-specific survival in patients with OPSCC. A 10-fold cross-validation on a development dataset was used to build the training and internal validation data for all models. We evaluated the predictive performance of each model using test datasets. Results A higher c-index value and lower integrated Brier score (IBS), root mean square error (RMSE), and mean absolute error (MAE) indicate a better performance from a machine learning model. The C-index was the highest for the DeepSurv model (0.77). The IBS was also the lowest in the DeepSurv model (0.08). However, the RMSE and RAE were the lowest for the CSF model. Conclusions We demonstrated various machine-learning-based survival prediction models. The CSF model showed a better performance in predicting the survival of patients with OPSCC in terms of the RMSE and RAE. In this context, machine learning models based on personalized survival predictions can be used to stratify various complex risk factors. This could help in designing personalized treatments and predicting prognoses for patients.
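Several abstracts in this list report the C-index as their main discrimination measure. As a reference point, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is a plain O(n²) illustration with ties in risk score counted as half-concordant; it is not the implementation used in any of the cited studies, and the toy data are invented.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an observed event and
    a strictly shorter time than patient j. The pair is concordant when
    the model assigns patient i the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up times, event indicators (1 = event), predicted risks.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The C-index measures discrimination only, which is why it is usually reported alongside a calibration-sensitive measure such as the integrated Brier score.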
|
72
|
Developing machine learning algorithms for dynamic estimation of progression during active surveillance for prostate cancer. NPJ Digit Med 2022; 5:110. [PMID: 35933478 PMCID: PMC9357044 DOI: 10.1038/s41746-022-00659-w] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2022] [Accepted: 07/14/2022] [Indexed: 11/15/2022] Open
Abstract
Active Surveillance (AS) for prostate cancer is a management option that continually monitors early disease and considers intervention if progression occurs. A robust method to incorporate “live” updates of progression risk during follow-up has hitherto been lacking. To address this, we developed a deep learning-based individualised longitudinal survival model using Dynamic-DeepHit-Lite (DDHL) that learns data-driven distribution of time-to-event outcomes. Further refining outputs, we used a reinforcement learning approach (Actor-Critic) for temporal predictive clustering (AC-TPC) to discover groups with similar time-to-event outcomes to support clinical utility. We applied these methods to data from 585 men on AS with longitudinal and comprehensive follow-up (median 4.4 years). Time-dependent C-indices and Brier scores were calculated and compared to Cox regression and landmarking methods. Both Cox and DDHL models including only baseline variables showed comparable C-indices but the DDHL model performance improved with additional follow-up data. With 3 years of data collection and 3 years follow-up the DDHL model had a C-index of 0.79 (±0.11) compared to 0.70 (±0.15) for landmarking Cox and 0.67 (±0.09) for baseline Cox only. Model calibration was good across all models tested. The AC-TPC method further discovered 4 distinct outcome-related temporal clusters with distinct progression trajectories. Those in the lowest risk cluster had negligible progression risk while those in the highest cluster had a 50% risk of progression by 5 years. In summary, we report a novel machine learning approach to inform personalised follow-up during active surveillance which improves predictive power with increasing data input over time.
|
73
|
Zhong G, Ding Z, Zhang G, Xu J, Tu B, Zhan A, Zhang Y. The data flow risk monitoring system of the expressway networking system based on deep learning. INT J INTELL SYST 2022. [DOI: 10.1002/int.22972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Affiliation(s)
- Guoqing Zhong
- School of Information Engineering East China Jiaotong University Nanchang China
| | - Zhiquan Ding
- School of Information Engineering East China Jiaotong University Nanchang China
| | - Guolong Zhang
- School of Information Engineering East China Jiaotong University Nanchang China
| | - Jianbin Xu
- Department of Transportation of Jiangxi Province Traffic Monitoring and Command Centre Nanchang China
| | - Botao Tu
- School of Information Engineering East China Jiaotong University Nanchang China
| | - Aiyun Zhan
- School of Electrical and Automation Engineering East China Jiaotong University Nanchang China
| | - Yuejin Zhang
- School of Information Engineering East China Jiaotong University Nanchang China
| |
|
74
|
Zhu J, Gallego B. Causal inference for observational longitudinal studies using deep survival models. J Biomed Inform 2022; 131:104119. [PMID: 35714819 DOI: 10.1016/j.jbi.2022.104119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2022] [Revised: 05/11/2022] [Accepted: 06/06/2022] [Indexed: 11/28/2022]
Abstract
OBJECTIVE Causal inference for observational longitudinal studies often requires the accurate estimation of treatment effects on time-to-event outcomes in the presence of time-dependent patient history and time-dependent covariates. MATERIALS AND METHODS To tackle this longitudinal treatment effect estimation problem, we have developed a time-variant causal survival (TCS) model that uses the potential outcomes framework with an ensemble of recurrent subnetworks to estimate the difference in survival probabilities and its confidence interval over time as a function of time-dependent covariates and treatments. RESULTS Using simulated survival datasets, the TCS model showed good causal effect estimation performance across scenarios of varying sample dimensions, event rates, confounding and overlapping. However, increasing the sample size was not effective in alleviating the adverse impact of a high level of confounding. In a large clinical cohort study, TCS identified the expected conditional average treatment effect and detected individual treatment effect heterogeneity over time. TCS provides an efficient way to estimate and update individualized treatment effects over time, in order to improve clinical decisions. DISCUSSION The use of a propensity score layer and potential outcome subnetworks helps correct for selection bias. However, the proposed model is limited in its ability to correct the bias from unmeasured confounding, and more extensive testing of TCS under extreme scenarios such as low overlapping and the presence of unmeasured confounders is desired and left for future work. CONCLUSION TCS fills the gap in causal inference using deep learning techniques in survival analysis. It considers time-varying confounders and treatment options. Its treatment effect estimation can be easily compared with the conventional literature, which uses relative measures of treatment effect. We expect TCS will be particularly useful for identifying and quantifying treatment effect heterogeneity over time in the increasingly complex observational health care environment.
Affiliation(s)
- Jie Zhu
- Centre for Big Data Research in Health (CBDRH), UNSW, Sydney, NSW 2052, Australia.
| | - Blanca Gallego
- Centre for Big Data Research in Health (CBDRH), UNSW, Sydney, NSW 2052, Australia.
| |
|
75
|
Bao L, Wang YT, Zhuang JL, Liu AJ, Dong YJ, Chu B, Chen XH, Lu MQ, Shi L, Gao S, Fang LJ, Xiang QQ, Ding YH. Machine Learning–Based Overall Survival Prediction of Elderly Patients With Multiple Myeloma From Multicentre Real-Life Data. Front Oncol 2022; 12:922039. [PMID: 35865475 PMCID: PMC9293757 DOI: 10.3389/fonc.2022.922039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2022] [Accepted: 06/03/2022] [Indexed: 11/26/2022] Open
Abstract
Objective To use machine learning methods to explore overall survival (OS)-related prognostic factors in elderly multiple myeloma (MM) patients. Methods Data were cleaned and imputed using simple imputation methods. Two data resampling methods were implemented to facilitate model building and cross-validation. Four algorithms, namely the Cox proportional hazards model (CPH), DeepSurv, DeepHit, and the random survival forest (RSF), were applied to incorporate 30 parameters, such as baseline data, genetic abnormalities and treatment options, to construct a prognostic model for OS prediction in 338 elderly MM patients (>65 years old) from four hospitals in Beijing. The C-index and the integrated Brier score (IBS) were used to evaluate model performance. Results The 30 variables incorporated in the models comprised MM baseline data, induction treatment data and maintenance therapy data. The variable importance test showed that the OS predictions were largely affected by the maintenance schema variable. Visualizing the survival curves by maintenance schema, we found that the immunomodulator group had the best survival rate. C-indexes of 0.769, 0.780, 0.785, 0.798 and IBS values of 0.142, 0.112, 0.108, 0.099 were obtained from the CPH model, DeepSurv, DeepHit, and the RSF model, respectively. The RSF model yielded the best scores in the fivefold cross-validation, and the results showed that different data resampling methods did affect our model results. Conclusion We established an OS model for elderly MM patients without genomic data based on 30 characteristics and treatment data by machine learning.
Affiliation(s)
- Li Bao
- Department of Hematology, Beijing Jishuitan Hospital, 4th Clinical Medical College of Peking University, Beijing, China
- *Correspondence: Li Bao,
| | - Yu-tong Wang
- Department of Hematology, Beijing Jishuitan Hospital, 4th Clinical Medical College of Peking University, Beijing, China
| | - Jun-ling Zhuang
- Department of Hematology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, China
| | - Ai-jun Liu
- Department of Hematology, Beijing Chao Yang Hospital, Capital Medical University, Beijing, China
| | - Yu-jun Dong
- Department of Hematology, The First Hospital of Peking University, Beijing, China
| | - Bin Chu
- Department of Hematology, Beijing Jishuitan Hospital, 4th Clinical Medical College of Peking University, Beijing, China
| | - Xiao-huan Chen
- Department of Hematology, Beijing Jishuitan Hospital, 4th Clinical Medical College of Peking University, Beijing, China
| | - Min-qiu Lu
- Department of Hematology, Beijing Jishuitan Hospital, 4th Clinical Medical College of Peking University, Beijing, China
| | - Lei Shi
- Department of Hematology, Beijing Jishuitan Hospital, 4th Clinical Medical College of Peking University, Beijing, China
| | - Shan Gao
- Department of Hematology, Beijing Jishuitan Hospital, 4th Clinical Medical College of Peking University, Beijing, China
| | - Li-juan Fang
- Department of Hematology, Beijing Jishuitan Hospital, 4th Clinical Medical College of Peking University, Beijing, China
| | - Qiu-qing Xiang
- Department of Hematology, Beijing Jishuitan Hospital, 4th Clinical Medical College of Peking University, Beijing, China
| | - Yue-hua Ding
- Department of Hematology, Beijing Jishuitan Hospital, 4th Clinical Medical College of Peking University, Beijing, China
| |
|
76
|
Ren J, Liu D, Li G, Duan J, Dong J, Liu Z. Prediction and Risk Stratification of Cardiovascular Disease in Diabetic Kidney Disease Patients. Front Cardiovasc Med 2022; 9:923549. [PMID: 35811691 PMCID: PMC9263287 DOI: 10.3389/fcvm.2022.923549] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2022] [Accepted: 05/30/2022] [Indexed: 11/13/2022] Open
Abstract
Background Diabetic kidney disease (DKD) patients face an extremely high risk of cardiovascular disease (CVD), which is a major cause of death for DKD patients. We aimed to build a deep learning model to predict CVD risk among DKD patients and perform risk stratification, which could help them pursue early intervention and improve personal health management. Methods A retrospective cohort study was conducted to assess the risk of the occurrence of composite cardiovascular disease, which includes coronary heart disease, cerebrovascular diseases, congestive heart failure, and peripheral artery disease, in DKD patients. A least absolute shrinkage and selection operator (LASSO) regression was used to perform the variable selection. A deep learning-based survival model called DeepSurv, based on a feed-forward neural network, was developed to predict CVD risk among DKD patients. We compared the model performance with the conventional Cox proportional hazards (CPH) model and the random survival forest (RSF) model using the concordance index (C-index), the area under the curve (AUC), and integrated Brier scores (IBS). Results We recruited 890 patients diagnosed with DKD in this retrospective study. During a median follow-up of 10.4 months, 289 patients sustained a subsequent CVD. Seven variables, including age, high density lipoprotein (HDL), hemoglobin (Hb), systolic blood pressure (SBP), smoking status, 24 h urinary protein excretion, and total cholesterol (TC), chosen by LASSO regression were used to develop the predictive model. The DeepSurv model showed the best performance, achieving a C-index of 0.767 (95% confidence interval [CI]: 0.717–0.817), AUC of 0.780 (95% CI: 0.721–0.839), and IBS of 0.067 in the validation set. We then used the cut-off value determined by the receiver operating characteristic (ROC) curve to divide the patients into different risk groups. Moreover, the DeepSurv model was also applied to develop an online calculation tool for patients to conduct risk monitoring. Conclusion A deep-learning-based predictive model using seven clinical variables can effectively predict CVD risk among DKD patients and perform risk stratification. An online calculator allows its easy implementation.
Affiliation(s)
- Jingjing Ren
- Department of Integrated Traditional and Western Nephrology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Research Institute of Nephrology, Zhengzhou University, Zhengzhou, China
- Henan Province Research Center for Kidney Disease, Zhengzhou, China
- Key Laboratory of Precision Diagnosis and Treatment for Chronic Kidney Disease in Henan Province, Zhengzhou, China
- Clinical Research Center of Big-data, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
| | - Dongwei Liu
- Department of Integrated Traditional and Western Nephrology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Research Institute of Nephrology, Zhengzhou University, Zhengzhou, China
- Henan Province Research Center for Kidney Disease, Zhengzhou, China
- Key Laboratory of Precision Diagnosis and Treatment for Chronic Kidney Disease in Henan Province, Zhengzhou, China
| | - Guangpu Li
- Department of Integrated Traditional and Western Nephrology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Research Institute of Nephrology, Zhengzhou University, Zhengzhou, China
- Henan Province Research Center for Kidney Disease, Zhengzhou, China
- Key Laboratory of Precision Diagnosis and Treatment for Chronic Kidney Disease in Henan Province, Zhengzhou, China
- Clinical Research Center of Big-data, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
| | - Jiayu Duan
- Department of Integrated Traditional and Western Nephrology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Research Institute of Nephrology, Zhengzhou University, Zhengzhou, China
- Henan Province Research Center for Kidney Disease, Zhengzhou, China
- Key Laboratory of Precision Diagnosis and Treatment for Chronic Kidney Disease in Henan Province, Zhengzhou, China
- Clinical Research Center of Big-data, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
| | - Jiancheng Dong
- Clinical Research Center of Big-data, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
| | - Zhangsuo Liu
- Department of Integrated Traditional and Western Nephrology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Research Institute of Nephrology, Zhengzhou University, Zhengzhou, China
- Henan Province Research Center for Kidney Disease, Zhengzhou, China
- Key Laboratory of Precision Diagnosis and Treatment for Chronic Kidney Disease in Henan Province, Zhengzhou, China
- *Correspondence: Zhangsuo Liu
| |
|
77
|
Yang L, Fan X, Qin W, Xu Y, Zou B, Fan B, Wang S, Dong T, Wang L. A novel deep learning prognostic system improves survival predictions for stage III non-small cell lung cancer. Cancer Med 2022; 11:4246-4255. [PMID: 35491970 PMCID: PMC9678103 DOI: 10.1002/cam4.4782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2021] [Revised: 03/14/2022] [Accepted: 04/10/2022] [Indexed: 11/30/2022]
Abstract
Background Accurate prognostic prediction plays a crucial role in the clinical setting. However, the TNM staging system fails to provide satisfactory individual survival prediction for stage III non‐small cell lung cancer (NSCLC). The performance of the deep learning network for survival prediction in stage III NSCLC has not been explored. Objectives This study aimed to develop a deep learning‐based prognostic system that could achieve better predictive performance than the existing staging system for stage III NSCLC. Methods In this study, a deep survival learning model (DSLM) for stage III NSCLC was developed based on the Surveillance, Epidemiology, and End Results (SEER) database and was independently tested with another external cohort from our institute. DSLM was compared with the Cox proportional hazard (CPH) and random survival forest (RSF) models. A new prognostic system for stage III NSCLC was also proposed based on the established deep learning model. Results The study included 16,613 patients with stage III NSCLC from the SEER database. DSLM showed the best performance in survival prediction, with a C‐index of 0.725 in the validation set, followed by RSF (0.688) and CPH (0.683). DSLM also showed C‐indices of 0.719 and 0.665 in the internal and real‐world external testing datasets, respectively. In addition, the new prognostic system based on DSLM (AUROC = 0.744) showed better performance than the TNM staging system (AUROC = 0.561). Conclusion In this study, a new, integrated deep learning‐based prognostic model was developed and evaluated for stage III NSCLC. This novel approach may be valuable in improving patient stratification and potentially provide meaningful prognostic information that contributes to personalized therapy.
Affiliation(s)
- Linlin Yang
- Cheeloo College of Medicine, Shandong University, Jinan, China
| | - Xinyu Fan
- Cheeloo College of Medicine, Shandong University, Jinan, China
| | - Wenru Qin
- Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Science, Jinan, China; Weifang Medical University, Weifang, China
| | - Yiyue Xu
- Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Science, Jinan, China
| | - Bing Zou
- Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Science, Jinan, China
| | - Bingjie Fan
- Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Science, Jinan, China
| | - Shijiang Wang
- Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Science, Jinan, China
| | - Taotao Dong
- Department of Obstetrics and Gynecology, Qilu Hospital of Shandong University, Jinan, China
| | - Linlin Wang
- Cheeloo College of Medicine, Shandong University, Jinan, China; Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Science, Jinan, China
| |
|
78
|
Hong C, Yi F, Huang Z. Deep-CSA: Deep Contrastive Learning for Dynamic Survival Analysis with Competing Risks. IEEE J Biomed Health Inform 2022; 26:4248-4257. [PMID: 35412993 DOI: 10.1109/jbhi.2022.3161145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
Survival analysis (SA) is widely used to analyze data in which the time until the event is of interest. Conventional SA techniques assume a specific form for viewing the distribution of survival time as the hitting time of a stochastic process, and explicitly model the relationship between covariates and the distribution of the event's hitting time. Although valuable, existing SA models seldom model the dynamic correlations between covariates and more than one event of interest (i.e., competing risks) in the disease progression of subjects. To alleviate this critical problem, we propose a novel deep contrastive learning model to obtain a deep understanding of disease progression of subjects with competing risks from their longitudinal observational data. Specifically, we design a self-supervised objective for learning dynamic representations of subjects suffering from multiple competing risks, such that changes over time in the relationship between covariates and each specific competing risk can be well captured. Experiments on two open-source clinical datasets, i.e., MIMIC-III and EICU, demonstrate the effectiveness of our proposed model, with remarkable improvements over the state-of-the-art SA models.
|
79
|
Cottin A, Pecuchet N, Zulian M, Guilloux A, Katsahian S. IDNetwork: A deep illness‐death network based on multi‐state event history process for disease prognostication. Stat Med 2022; 41:1573-1598. [DOI: 10.1002/sim.9310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2021] [Revised: 10/28/2021] [Accepted: 12/17/2021] [Indexed: 11/12/2022]
Affiliation(s)
- Aziliz Cottin
- Healthcare and Life Sciences Research Dassault Systemes Velizy‐Villacoublay France
| | - Nicolas Pecuchet
- Healthcare and Life Sciences Research Dassault Systemes Velizy‐Villacoublay France
| | - Marine Zulian
- Healthcare and Life Sciences Research Dassault Systemes Velizy‐Villacoublay France
| | - Agathe Guilloux
- CNRS Université Paris‐Saclay Paris France
- Laboratoire de Mathématiques et Modélisation d'Evry Université d'Evry Evry‐Courcouronnes France
| | - Sandrine Katsahian
- AP‐HP Hôpital Européen Georges Pompidou, Unité de Recherche Clinique, APHP Centre Paris France
- Inserm Centre d'Investigation Clinique 1418 (CIC1418) Epidémiologie Clinique Paris France
- Inserm Centre de recherche des Cordeliers, Sorbonne Université, Université de Paris Paris France
- HeKA, INRIA PARIS Paris France
| |
|
80
|
Zhu J, Jiang M, Liu Z. Fault Detection and Diagnosis in Industrial Processes with Variational Autoencoder: A Comprehensive Study. Sensors (Basel) 2021; 22:227. [PMID: 35009769 PMCID: PMC8749793 DOI: 10.3390/s22010227] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/22/2021] [Revised: 12/13/2021] [Accepted: 12/27/2021] [Indexed: 06/14/2023]
Abstract
This work considers industrial process monitoring using a variational autoencoder (VAE). As a powerful deep generative model, the variational autoencoder and its variants have become popular for process monitoring. However, its monitoring ability, especially its fault diagnosis ability, has not been well investigated. In this paper, the process modeling and monitoring capabilities of several VAE variants are comprehensively studied. First, fault detection schemes are defined in three distinct ways, considering latent, residual, and the combined domains. Afterwards, to conduct the fault diagnosis, we first define the deep contribution plot, and then a deep reconstruction-based contribution diagram is proposed for deep domains under the fault propagation mechanism. In a case study, the performance of the process monitoring capability of four deep VAE models, namely, the static VAE model, the dynamic VAE model, and the recurrent VAE models (LSTM-VAE and GRU-VAE), has been comparatively evaluated on the industrial benchmark Tennessee Eastman process. Results show that recurrent VAEs with a deep reconstruction-based diagnosis mechanism are recommended for industrial process monitoring tasks.
Affiliation(s)
- Jinlin Zhu
- State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, China
- School of Food Science and Technology, Jiangnan University, Wuxi 214122, China
| | - Muyun Jiang
- School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
| | - Zhong Liu
- Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, China
| |
|
81
|
Establishment of a Predictive Model for GvHD-free, Relapse-free Survival after Allogeneic HSCT using Ensemble Learning. Blood Adv 2021; 6:2618-2627. [PMID: 34933327 PMCID: PMC9043925 DOI: 10.1182/bloodadvances.2021005800] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/21/2021] [Accepted: 11/23/2021] [Indexed: 12/03/2022]
Abstract
A stacked ensemble of machine-learning algorithms could establish a more accurate prediction model for survival analysis than existing methods. The stacked ensemble model can be applied to personalized prediction of HSCT outcomes from pretransplant characteristics.
Graft-versus-host disease-free, relapse-free survival (GRFS) is a useful composite end point that measures survival without relapse or significant morbidity after allogeneic hematopoietic stem cell transplantation (allo-HSCT). We aimed to develop a novel analytical method that appropriately handles right-censored data and competing risks to understand the risk for GRFS and each component of GRFS. This study was a retrospective data-mining study on a cohort of 2207 adult patients who underwent their first allo-HSCT within the Kyoto Stem Cell Transplantation Group, a multi-institutional joint research group of 17 transplantation centers in Japan. The primary end point was GRFS. A stacked ensemble of Cox Proportional Hazard (Cox-PH) regression and 7 machine-learning algorithms was applied to develop a prediction model. The median age for the patients was 48 years. For GRFS, the stacked ensemble model achieved better predictive accuracy evaluated by C-index than other state-of-the-art competing risk models (ensemble model: 0.670; Cox-PH: 0.668; Random Survival Forest: 0.660; Dynamic DeepHit: 0.646). The probability of GRFS after 2 years was 30.54% for the high-risk group and 40.69% for the low-risk group (hazard ratio compared with the low-risk group: 2.127; 95% CI, 1.19-3.80). We developed a novel predictive model for survival analysis that showed superior risk stratification to existing methods using a stacked ensemble of multiple machine-learning algorithms.
|
82
|
Yang Z, Tian Y, Zhou T, Zhu Y, Zhang P, Chen J, Li J. Time-series deep survival prediction for hemodialysis patients using an attention-based Bi-GRU network. Comput Methods Programs Biomed 2021; 212:106458. [PMID: 34736175 DOI: 10.1016/j.cmpb.2021.106458] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/21/2020] [Accepted: 10/03/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND AND OBJECTIVE The number of end-stage renal disease (ESRD) patients treated with hemodialysis (HD) has significantly increased, but the prognosis remains poor. Time-series features have been included in only a few studies to predict HD patient survival, and how to utilize such features effectively remains unclear. This article aims to develop a more accurate, interpretable, and clinically practical personalized survival prediction model for HD patients. METHODS This study proposed and evaluated an attention-based Bi-GRU network using time-series features for survival prediction. A distance-based loss function was proposed to improve performance. We used data from 1232 ESRD patients who received regular hemodialysis treatment for ≥ 3 months from 2007 to 2016 at the First Affiliated Hospital of Zhejiang University. The proposed model was compared with representative sequence modeling deep learning architectures and existing survival analysis methods in terms of the C-index and IBS value. Post hoc tests were used to test statistical significance. The attention map was used to assess feature importance over time. The impact of time-series changes on survival was investigated after controlling initial values (using BMI as an example). RESULTS The proposed method outperformed other sequence modeling architectures and the state-of-the-art survival analysis approaches in terms of the C-index and the integrated Brier score (IBS) value. Our method achieved a C-index of 0.7680 (95% confidence intervals [CI]: 0.7645, 0.7716) and an IBS of 0.1302 (95% confidence intervals [CI]: 0.1292, 0.1313), showing an improvement of up to 5.4% in terms of the C-index and a decrease of 3.2% in terms of the IBS value. The addition of the distance-based loss function improved the performance. The predicted risk and actual risk levels closely agreed. This study also found that even after controlling the initial body mass index (BMI) values, different 3-month BMI trends could produce different survival outcomes. CONCLUSIONS This study proposed a more effective and interpretable method to use time-series information in survival analysis. The proposed method may help promote personalized medicine and improve patient prognosis.
Affiliation(s)
- Ziyue Yang
- Engineering Research Center of EMR and Intelligent Expert System, Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, No. 38 Zheda Road, Hangzhou 310027, China
| | - Yu Tian
- Engineering Research Center of EMR and Intelligent Expert System, Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, No. 38 Zheda Road, Hangzhou 310027, China
| | - Tianshu Zhou
- Research Center for Healthcare Data Science, Zhejiang Lab, Hangzhou, China
| | - Yilin Zhu
- Kidney Disease Center, the First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, China
| | - Ping Zhang
- Kidney Disease Center, the First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, China
| | - Jianghua Chen
- Kidney Disease Center, the First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, China
| | - Jingsong Li
- Engineering Research Center of EMR and Intelligent Expert System, Ministry of Education, College of Biomedical Engineering and Instrument Science, Zhejiang University, No. 38 Zheda Road, Hangzhou 310027, China; Research Center for Healthcare Data Science, Zhejiang Lab, Hangzhou, China.
| |
|
83
|
Agius R, Parviz M, Niemann CU. Artificial intelligence models in chronic lymphocytic leukemia - recommendations toward state-of-the-art. Leuk Lymphoma 2021; 63:265-278. [PMID: 34612160 DOI: 10.1080/10428194.2021.1973672] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/07/2023]
Abstract
Artificial intelligence (AI), machine learning and predictive modeling are becoming enabling technologies in many day-to-day applications. Translation of these advances to the patient's bedside for AI assisted interventions is not yet the norm. With specific emphasis on CLL, here, we review the progress of prognostic models in hematology and highlight sources of stagnation that may be limiting significant improvements in prognostication in the near future. We discuss issues related to performance, trust, modeling simplicity, and prognostic marker robustness and find that the major limiting factor in progressing toward state-of-the-art prognostication within the hematological community, is not the lack of able AI algorithms but rather, the lack of their adoption. Current models in CLL still deal with the 'average' patient while the use of patient-centric approaches remains absent. Using lessons from research areas where machine learning has become an enabling technology, we derive recommendations and propose methods for achieving state-of-the-art predictions in modeling health data, that can be readily adopted by the CLL modeling community.
Affiliation(s)
- Rudi Agius
- Department of Hematology, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
| | - Mehdi Parviz
- Department of Hematology, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
| | - Carsten Utoft Niemann
- Department of Hematology, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
| |
|
84
|
Boškoski P, Perne M, Rameša M, Boshkoska BM. Variational Bayes survival analysis for unemployment modelling. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107335] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/20/2022]
|
85
|
Park JH, Choi J, Lee S, Shin SD, Song KJ. Use of Time-to-Event Analysis to Develop On-Scene Return of Spontaneous Circulation Prediction for Out-of-Hospital Cardiac Arrest Patients. Ann Emerg Med 2021; 79:132-144. [PMID: 34417073 DOI: 10.1016/j.annemergmed.2021.07.121] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 04/07/2021] [Revised: 07/05/2021] [Accepted: 07/14/2021] [Indexed: 12/23/2022]
Abstract
STUDY OBJECTIVE We aimed to train and validate the time to on-scene return of spontaneous circulation prediction models using time-to-event analysis among out-of-hospital cardiac arrest patients. METHODS Using a Korean population-based out-of-hospital cardiac arrest registry, we selected a total of 105,215 adults with presumed cardiac etiologies between 2013 and 2018. Patients from 2013 to 2017 and from 2018 were analyzed for training and test, respectively. We developed 4 time-to-event analyzing models (Cox proportional hazard [Cox], random survival forest, extreme gradient boosting survival, and DeepHit) and 4 classification models (logistic regression, random forest, extreme gradient boosting, and feedforward neural network). Patient characteristics and Utstein elements collected at the scene were used as predictors. Discrimination and calibration were evaluated by Harrell's C-index and integrated Brier score. RESULTS Among the 105,215 patients (mean age 70 years and 64% men), 86,314 and 18,901 patients belonged to the training and test sets, respectively. On-scene return of spontaneous circulation was achieved in 5,240 (6.1%) patients in the former set and 1,709 (9.0%) patients in the latter. The proportion of emergency medical services (EMS) management was higher and scene time interval longer in the latter. Median time from EMS scene arrival to on-scene return of spontaneous circulation was 8 minutes for both datasets. Classification models showed similar discrimination and poor calibration power compared to survival models; Cox showed high discrimination with the best calibration (C-index [95% confidence interval]: 0.873 [0.865 to 0.882]; integrated Brier score at 30 minutes: 0.060). CONCLUSION Incorporating time-to-event analysis could lead to improved performance in prediction models and contribute to personalized field EMS resuscitation decisions.
Affiliation(s)
- Jeong Ho Park
- Department of Biomedical Engineering, Seoul National University College of Medicine, Seoul, Korea; Department of Emergency Medicine, Seoul National University College of Medicine and Hospital, Seoul, Korea; Laboratory of Emergency Medical Services, Seoul National University Hospital Biomedical Research Institute, Seoul, Korea
| | - Jinwook Choi
- Department of Biomedical Engineering, Seoul National University College of Medicine, Seoul, Korea; Institute of Medical and Biological Engineering, Medical Research Center, Seoul National University, Seoul, Korea.
| | - SangMyeong Lee
- School of Electrical Engineering, Undergraduate School of Korea University, Seoul, Korea
| | - Sang Do Shin
- Department of Emergency Medicine, Seoul National University College of Medicine and Hospital, Seoul, Korea; Laboratory of Emergency Medical Services, Seoul National University Hospital Biomedical Research Institute, Seoul, Korea
| | - Kyoung Jun Song
- Laboratory of Emergency Medical Services, Seoul National University Hospital Biomedical Research Institute, Seoul, Korea; Department of Emergency Medicine, Seoul National University College of Medicine and Seoul National University Boramae Medical Center, Seoul, Korea
| |
|
86
|
Nagpal C, Li X, Dubrawski A. Deep Survival Machines: Fully Parametric Survival Regression and Representation Learning for Censored Data With Competing Risks. IEEE J Biomed Health Inform 2021; 25:3163-3175. [PMID: 33460387 DOI: 10.1109/jbhi.2021.3052441] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Indexed: 11/09/2022]
Abstract
We describe a new approach to estimating relative risks in time-to-event prediction problems with censored data in a fully parametric manner. Our approach does not require making strong assumptions of constant proportional hazards of the underlying survival distribution, as required by the Cox-proportional hazard model. By jointly learning deep nonlinear representations of the input covariates, we demonstrate the benefits of our approach when used to estimate survival risks through extensive experimentation on multiple real world datasets with different levels of censoring. We further demonstrate advantages of our model in the competing risks scenario. To the best of our knowledge, this is the first work involving fully parametric estimation of survival times with competing risks in the presence of censoring.
|
87
|
Putzel P, Do H, Boyd A, Zhong H, Smyth P. Dynamic Survival Analysis for EHR Data with Personalized Parametric Distributions. Proc Mach Learn Res 2021; 149:648-673. [PMID: 35425906 PMCID: PMC9006243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
The widespread availability of high-dimensional electronic healthcare record (EHR) datasets has led to significant interest in using such data to derive clinical insights and make risk predictions. More specifically, techniques from machine learning are being increasingly applied to the problem of dynamic survival analysis, where updated time-to-event risk predictions are learned as a function of the full covariate trajectory from EHR datasets. EHR data presents unique challenges in the context of dynamic survival analysis, involving a variety of decisions about data representation, modeling, interpretability, and clinically meaningful evaluation. In this paper we propose a new approach to dynamic survival analysis which addresses some of these challenges. Our modeling approach is based on learning a global parametric distribution to represent population characteristics and then dynamically locating individuals on the time-axis of this distribution conditioned on their histories. For evaluation we also propose a new version of the dynamic C-Index for clinically meaningful evaluation of dynamic survival models. To validate our approach we conduct dynamic risk prediction on three real-world datasets, involving COVID-19 severe outcomes, cardiovascular disease (CVD) onset, and primary biliary cirrhosis (PBC) time-to-transplant. We find that our proposed modeling approach is competitive with other well-known statistical and machine learning approaches for dynamic risk prediction, while offering potential advantages in terms of interpretability of predictions at the individual level.
Affiliation(s)
- Preston Putzel
- Department of Computer Science, University of California, Irvine, CA, USA
| | - Hyungrok Do
- Department of Population Health, NYU Grossman School of Medicine, New York, NY, USA
| | - Alex Boyd
- Department of Statistics, University of California, Irvine, CA, USA
| | - Hua Zhong
- Department of Population Health, NYU Grossman School of Medicine, New York, NY, USA
| | - Padhraic Smyth
- Department of Computer Science, University of California, Irvine, CA, USA
| |
|
88
|
Sun Z, Sun Z, Dong W, Shi J, Huang Z. Towards Predictive Analysis on Disease Progression: A Variational Hawkes Process Model. IEEE J Biomed Health Inform 2021; 25:4195-4206. [PMID: 34329176 DOI: 10.1109/jbhi.2021.3101113] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
Massively available longitudinal data about long-term disease trajectories of patients provides a gold mine for the understanding of disease progression and efficient health service delivery. It calls for quantitative modeling of disease progression, which is a tricky problem due to the complexity of the disease progression process as well as the irregularity of time documented in trajectories. In this study, we tackle the problem with the goal of predictively analyzing disease progression. Specifically, we propose a novel Variational Hawkes Process (VHP) model to generalize disease progression and predict future patient states based on the clinical observational data of past disease trajectories. First, the Hawkes Process captures the intensity of irregular visits in a trajectory documented at medical facilities and controls the aforementioned information flowing into future visits. Thereafter, the captured intensity is incorporated into a Variational Auto-Encoder to generate the representation of the future partial disease trajectory for a target patient in a predictive manner. To further improve the prediction performance, we equip the proposed model with a disease trajectory discriminator to distinguish the generated trajectories from real ones. We evaluate the proposed model on two public datasets from the MIMIC-III database pertaining to heart failure and sepsis patients, respectively, and one real-world dataset from a Chinese hospital pertaining to heart failure patients with multiple admissions. Experimental results demonstrate that the proposed model significantly outperforms state-of-the-art baselines, and may derive a set of practical implications that can benefit a wide spectrum of management and applications on disease progression.
|
89
|
Lee C, Rashbass J, van der Schaar M. Outcome-Oriented Deep Temporal Phenotyping of Disease Progression. IEEE Trans Biomed Eng 2021; 68:2423-2434. [PMID: 33259292 DOI: 10.1109/tbme.2020.3041815] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/07/2022]
Abstract
Chronic diseases evolve slowly throughout a patient's lifetime creating heterogeneous progression patterns that make clinical outcomes remarkably varied across individual patients. A tool capable of identifying temporal phenotypes based on the patients' different progression patterns and clinical outcomes would allow clinicians to better forecast disease progression by recognizing a group of similar past patients, and to better design treatment guidelines that are tailored to specific phenotypes. To build such a tool, we propose a deep learning approach, which we refer to as outcome-oriented deep temporal phenotyping (ODTP), to identify temporal phenotypes of disease progression considering what type of clinical outcomes will occur and when based on the longitudinal observations. More specifically, we model clinical outcomes throughout a patient's longitudinal observations via time-to-event (TTE) processes whose conditional intensity functions are estimated as non-linear functions using a recurrent neural network. Temporal phenotyping of disease progression is carried out by our novel loss function that is specifically designed to learn discrete latent representations that best characterize the underlying TTE processes. The key insight here is that learning such discrete representations groups progression patterns considering the similarity in expected clinical outcomes, and thus naturally provides outcome-oriented temporal phenotypes. We demonstrate the power of ODTP by applying it to a real-world heterogeneous cohort of 11 779 stage III breast cancer patients from the U.K. National Cancer Registration and Analysis Service. The experiments show that ODTP identifies temporal phenotypes that are strongly associated with the future clinical outcomes and achieves significant gain on the homogeneity and heterogeneity measures over existing methods. Furthermore, we are able to identify the key driving factors that lead to transitions between phenotypes, which can be translated into actionable information to support better clinical decision-making.
|
90
|
Long-term cancer survival prediction using multimodal deep learning. Sci Rep 2021; 11:13505. [PMID: 34188098 PMCID: PMC8242026 DOI: 10.1038/s41598-021-92799-4] [Citation(s) in RCA: 74] [Impact Index Per Article: 18.5] [Received: 03/08/2021] [Accepted: 06/16/2021] [Indexed: 01/17/2023] Open
Abstract
The age of precision medicine demands powerful computational techniques to handle high-dimensional patient data. We present MultiSurv, a multimodal deep learning method for long-term pan-cancer survival prediction. MultiSurv uses dedicated submodels to establish feature representations of clinical, imaging, and different high-dimensional omics data modalities. A data fusion layer aggregates the multimodal representations, and a prediction submodel generates conditional survival probabilities for follow-up time intervals spanning several decades. MultiSurv is the first non-linear and non-proportional survival prediction method that leverages multimodal data. In addition, MultiSurv can handle missing data, including single values and complete data modalities. MultiSurv was applied to data from 33 different cancer types and yields accurate pan-cancer patient survival curves. A quantitative comparison with previous methods showed that MultiSurv achieves the best results according to different time-dependent metrics. We also generated visualizations of the learned multimodal representation of MultiSurv, which revealed insights on cancer characteristics and heterogeneity.
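The link between per-interval conditional survival probabilities and a patient survival curve can be sketched as a running product. This is a generic discrete-time construction, not MultiSurv's code; the function name `survival_curve` and the example probabilities are hypothetical.

```python
def survival_curve(conditional_probs):
    """Turn per-interval conditional survival probabilities into a curve.

    conditional_probs[k] is assumed to be P(survive interval k | alive at
    its start). The probability of surviving through interval k is the
    product of the conditional probabilities up to and including k.
    """
    curve = []
    alive = 1.0
    for p in conditional_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each conditional probability must lie in [0, 1]")
        alive *= p
        curve.append(alive)
    return curve

# Hypothetical outputs of a prediction submodel for three follow-up intervals.
curve = survival_curve([0.9, 0.8, 0.5])
print(curve)
```

Because each factor is at most 1, the resulting curve is non-increasing by construction, and no proportional-hazards assumption is needed to obtain it.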
|
91
|
Abroshan M, Alaa AM, Rayner O, van der Schaar M. Opportunities for machine learning to transform care for people with cystic fibrosis. J Cyst Fibros 2021; 19:6-8. [PMID: 32000972 DOI: 10.1016/j.jcf.2020.01.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Affiliation(s)
- Mahed Abroshan
- The Department of Applied Mathematics and Theoretical Physics and The Department of Public Health, University of Cambridge, United Kingdom
| | - Ahmed M Alaa
- Department of Electrical and Computer Engineering, University of California Los Angeles, Los Angeles, California, USA
| | - Oli Rayner
- Person with CF, Plymouth, United Kingdom
| | - Mihaela van der Schaar
- The Department of Applied Mathematics and Theoretical Physics and The Department of Public Health, University of Cambridge, United Kingdom; Department of Electrical and Computer Engineering, University of California Los Angeles, Los Angeles, California, USA.
|
92
|
Guan Y, Li H, Yi D, Zhang D, Yin C, Li K, Zhang P. A survival model generalized to regression learning algorithms. Nat Comput Sci 2021; 1:433-440. [PMID: 34312611 DOI: 10.1038/s43588-021-00083-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
Survival prediction is an important problem that is encountered widely in industry and medicine. Despite the explosion of artificial intelligence technologies, no unified method allows the application of any type of regression learning algorithm to a survival prediction problem. Here, we present a statistical modeling method that is generalized to all types of regression learning algorithms, including deep learning. We present its empirical advantage when it is applied to traditional survival problems. We demonstrate its expanded applications in different types of regression learning algorithms, such as gradient boosted trees, convolutional neural networks and recurrent neural networks. Additionally, we demonstrate its application in clinical informatics data, pathological images and the hardware industry. We expect that this algorithm will be widely applicable for diverse types of survival data, including discrete data types and those suitable for deep learning such as those with time or spatial continuity.
Affiliation(s)
- Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, Michigan Medicine, University of Michigan, Ann Arbor, MI, USA; Department of Internal Medicine, University of Michigan, Ann Arbor, MI, USA
| | - Hongyang Li
- Department of Computational Medicine and Bioinformatics, Michigan Medicine, University of Michigan, Ann Arbor, MI, USA
| | - Daiyao Yi
- Department of Biomedical Engineering, Michigan Medicine, University of Michigan, Ann Arbor, MI, USA
| | - Dongdong Zhang
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
| | - Changchang Yin
- Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
| | - Keyu Li
- Department of Computational Medicine and Bioinformatics, Michigan Medicine, University of Michigan, Ann Arbor, MI, USA
| | - Ping Zhang
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA; Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
|
93
|
Use of deep learning to develop continuous-risk models for adverse event prediction from electronic health records. Nat Protoc 2021; 16:2765-2787. [PMID: 33953393 DOI: 10.1038/s41596-021-00513-5] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 04/06/2020] [Accepted: 01/25/2021] [Indexed: 02/03/2023]
Abstract
Early prediction of patient outcomes is important for targeting preventive care. This protocol describes a practical workflow for developing deep-learning risk models that can predict various clinical and operational outcomes from structured electronic health record (EHR) data. The protocol comprises five main stages: formal problem definition, data pre-processing, architecture selection, calibration and uncertainty, and generalizability evaluation. We have applied the workflow to four endpoints (acute kidney injury, mortality, length of stay and 30-day hospital readmission). The workflow can enable continuous (e.g., triggered every 6 h) and static (e.g., triggered at 24 h after admission) predictions. We also provide an open-source codebase that illustrates some key principles in EHR modeling. This protocol can be used by interdisciplinary teams with programming and clinical expertise to build deep-learning prediction models with alternate data sources and prediction tasks.
|
94
|
Terranova N, Venkatakrishnan K, Benincosa LJ. Application of Machine Learning in Translational Medicine: Current Status and Future Opportunities. AAPS J 2021; 23:74. [PMID: 34008139 PMCID: PMC8130984 DOI: 10.1208/s12248-021-00593-x] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 12/28/2020] [Accepted: 04/08/2021] [Indexed: 02/06/2023]
Abstract
The exponential increase in our ability to harness multi-dimensional biological and clinical data from experimental to real-world settings has transformed pharmaceutical research and development in recent years, with increasing applications of artificial intelligence (AI) and machine learning (ML). Patient-centered iterative forward and reverse translation is at the heart of precision medicine discovery and development across the continuum from target validation to optimization of pharmacotherapy. Integration of advanced analytics into the practice of Translational Medicine is now a fundamental enabler to fully exploit information contained in diverse sources of big data sets such as “omics” data, as illustrated by deep characterizations of the genome, transcriptome, proteome, metabolome, microbiome, and exposome. In this commentary, we provide an overview of ML applications in drug discovery and development, aligned with the three strategic pillars of Translational Medicine (target, patient, dose) and offer perspectives on their potential to transform the science and practice of the discipline. Opportunities for integrating ML approaches into the discipline of Pharmacometrics are discussed and will revolutionize the practice of model-informed drug discovery and development. Finally, we posit that joint efforts of Clinical Pharmacology, Bioinformatics, and Biomarker Technology experts are vital in cross-functional team settings to realize the promise of AI/ML-enabled Translational and Precision Medicine.
Affiliation(s)
- Nadia Terranova
- Translational Medicine, Merck Institute for Pharmacometrics, Merck Serono S.A., Lausanne, Switzerland
| | - Karthik Venkatakrishnan
- Translational Medicine, EMD Serono Research & Development Institute, Inc., Billerica, Massachusetts, USA
| | - Lisa J Benincosa
- Translational Medicine, EMD Serono Research & Development Institute, Inc., Billerica, Massachusetts, USA.
|
95
|
Yang S, Zhu F, Ling X, Liu Q, Zhao P. Intelligent Health Care: Applications of Deep Learning in Computational Medicine. Front Genet 2021; 12:607471. [PMID: 33912213 PMCID: PMC8075004 DOI: 10.3389/fgene.2021.607471] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 09/17/2020] [Accepted: 03/05/2021] [Indexed: 12/24/2022] Open
Abstract
With the progress of medical technology, the biomedical field has entered the era of big data; on this basis, and driven by artificial intelligence technology, computational medicine has emerged. People need to extract the useful information contained in these big biomedical data to promote the development of precision medicine. Traditionally, machine learning methods are used to mine biomedical data for features, which generally relies on feature engineering and experts' domain knowledge and requires tremendous time and human resources. Unlike traditional approaches, deep learning, a cutting-edge branch of machine learning, can automatically learn complex and robust features from raw data without the need for feature engineering. The applications of deep learning in medical imaging, electronic health records, genomics, and drug development are reviewed, and the findings suggest that deep learning has a clear advantage in making full use of biomedical data and improving the level of medical care. Deep learning plays an increasingly important role in the field of medical health and has broad prospects for application. However, problems and challenges of deep learning in computational medical health still exist, including insufficient data, interpretability, data privacy, and heterogeneity. Analysis and discussion of these problems provide a reference for improving the application of deep learning in medical health.
Affiliation(s)
- Sijie Yang
- School of Computer Science and Technology, Soochow University, Suzhou, China
| | - Fei Zhu
- School of Computer Science and Technology, Soochow University, Suzhou, China
| | - Xinghong Ling
- School of Computer Science and Technology, Soochow University, Suzhou, China
- WenZheng College of Soochow University, Suzhou, China
| | - Quan Liu
- School of Computer Science and Technology, Soochow University, Suzhou, China
| | - Peiyao Zhao
- School of Computer Science and Technology, Soochow University, Suzhou, China
|
96
|
Jin C, Yu H, Ke J, Ding P, Yi Y, Jiang X, Duan X, Tang J, Chang DT, Wu X, Gao F, Li R. Predicting treatment response from longitudinal images using multi-task deep learning. Nat Commun 2021; 12:1851. [PMID: 33767170 PMCID: PMC7994301 DOI: 10.1038/s41467-021-22188-y] [Citation(s) in RCA: 93] [Impact Index Per Article: 23.3] [Received: 09/28/2020] [Accepted: 03/02/2021] [Indexed: 12/24/2022] Open
Abstract
Radiographic imaging is routinely used to evaluate treatment response in solid tumors. Current imaging response metrics do not reliably predict the underlying biological response. Here, we present a multi-task deep learning approach that allows simultaneous tumor segmentation and response prediction. We design two Siamese subnetworks that are joined at multiple layers, which enables integration of multi-scale feature representations and in-depth comparison of pre-treatment and post-treatment images. The network is trained using 2568 magnetic resonance imaging scans of 321 rectal cancer patients for predicting pathologic complete response after neoadjuvant chemoradiotherapy. In multi-institution validation, the imaging-based model achieves AUC of 0.95 (95% confidence interval: 0.91–0.98) and 0.92 (0.87–0.96) in two independent cohorts of 160 and 141 patients, respectively. When combined with blood-based tumor markers, the integrated model further improves prediction accuracy with AUC 0.97 (0.93–0.99). Our approach to capturing dynamic information in longitudinal images may be broadly used for screening, treatment response evaluation, disease monitoring, and surveillance.
Affiliation(s)
- Cheng Jin
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
| | - Heng Yu
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
| | - Jia Ke
- Department of Colorectal Surgery, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; Guangdong Institute of Gastroenterology, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, Guangzhou, China
| | - Peirong Ding
- Department of Colorectal Surgery, Sun Yat-sen University Cancer Center, Guangzhou, China; Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, China
| | - Yongju Yi
- Center for Network Information, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Xiaofeng Jiang
- Department of Colorectal Surgery, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; Guangdong Institute of Gastroenterology, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, Guangzhou, China
| | - Xin Duan
- Department of Colorectal Surgery, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; Guangdong Institute of Gastroenterology, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, Guangzhou, China
| | - Jinghua Tang
- Department of Colorectal Surgery, Sun Yat-sen University Cancer Center, Guangzhou, China; Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, China
| | - Daniel T Chang
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
| | - Xiaojian Wu
- Department of Colorectal Surgery, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; Guangdong Institute of Gastroenterology, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, Guangzhou, China.
| | - Feng Gao
- Department of Colorectal Surgery, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; Guangdong Institute of Gastroenterology, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, Guangzhou, China.
| | - Ruijiang Li
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA.
|
97
|
Putzel P, Smyth P, Yu J, Zhong H. Dynamic Survival Analysis with Individualized Truncated Parametric Distributions. Proc Mach Learn Res 2021; 146:159-170. [PMID: 35372850 PMCID: PMC8969882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Dynamic survival analysis is a variant of traditional survival analysis where time-to-event predictions are updated as new information arrives about an individual over time. In this paper we propose a new approach to dynamic survival analysis based on learning a global parametric distribution, followed by individualization via truncating and renormalizing that distribution at different locations over time. We combine this approach with a likelihood-based loss that includes predictions at every time step within an individual's history, rather than just including one term per individual. The combination of this loss and model results in an interpretable approach to dynamic survival, requiring less fine tuning than existing methods, while still achieving good predictive performance. We evaluate the approach on the problem of predicting hospital mortality for a dataset with over 6900 COVID-19 patients.
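The "truncate and renormalize" step can be illustrated with a single parametric survival function: conditioning on an individual being event-free at time s gives S(t | T > s) = S(t) / S(s) for t >= s. The sketch below uses a Weibull distribution with fixed, hypothetical `shape` and `scale` values; the abstract does not name the distribution family or how the truncation location is chosen, so both are assumptions here.

```python
import math

def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t / scale) ** shape) for t >= 0."""
    return math.exp(-((t / scale) ** shape))

def conditional_survival(t, s, shape=1.5, scale=10.0):
    """P(T > t | T > s): the global distribution truncated at s and renormalized.

    For t < s the individual is already known to be event-free, so the
    conditional survival is 1.
    """
    if t < s:
        return 1.0
    return weibull_survival(t, shape, scale) / weibull_survival(s, shape, scale)

# Prediction for day 8 made at baseline versus updated at day 5.
print(weibull_survival(8.0, 1.5, 10.0))
print(conditional_survival(8.0, 5.0))
```

The updated prediction is always at least as high as the baseline one for the same horizon, because the renormalization divides by S(s), which is at most 1.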
Affiliation(s)
- Preston Putzel
- Department of Computer Science, University of California, Irvine, CA, USA
| | - Padhraic Smyth
- Department of Computer Science, University of California, Irvine, CA, USA
| | - Jaehong Yu
- Department of Industrial and Management Engineering, Incheon National University, 119 Academy-Ro, Yeonsu-Gu, Songdo-dong Incheon 22012, South Korea
| | - Hua Zhong
- Department of Population Health, NYU Grossman School of Medicine, New York, NY, USA
|
98
|
Cary MP, Zhuang F, Draelos RL, Pan W, Amarasekara S, Douthit BJ, Kang Y, Colón-Emeric CS. Machine Learning Algorithms to Predict Mortality and Allocate Palliative Care for Older Patients With Hip Fracture. J Am Med Dir Assoc 2021; 22:291-296. [PMID: 33132014 PMCID: PMC7867606 DOI: 10.1016/j.jamda.2020.09.025] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/01/2020] [Revised: 09/18/2020] [Accepted: 09/21/2020] [Indexed: 12/21/2022]
Abstract
OBJECTIVES To evaluate a machine learning model designed to predict mortality for Medicare beneficiaries aged >65 years treated for hip fracture in Inpatient Rehabilitation Facilities (IRFs). DESIGN Retrospective design/cohort analysis of Centers for Medicare & Medicaid Services Inpatient Rehabilitation Facility-Patient Assessment Instrument data. SETTING AND PARTICIPANTS A total of 17,140 persons admitted to Medicare-certified IRFs in 2015 following hospitalization for hip fracture. MEASURES Patient characteristics include sociodemographic (age, gender, race, and social support) and clinical factors (functional status at admission, chronic conditions) and IRF length of stay. Outcomes were 30-day and 1-year all-cause mortality. We trained and evaluated 2 classification models, logistic regression and a multilayer perceptron (MLP), to predict the probability of 30-day and 1-year mortality and evaluated the calibration, discrimination, and precision of the models. RESULTS For 30-day mortality, MLP performed well [acc = 0.74, area under the receiver operating characteristic curve (AUROC) = 0.76, avg prec = 0.10, slope = 1.14] as did logistic regression (acc = 0.78, AUROC = 0.76, avg prec = 0.09, slope = 1.20). For 1-year mortality, the performances were similar for both MLP (acc = 0.68, AUROC = 0.75, avg prec = 0.32, slope = 0.96) and logistic regression (acc = 0.68, AUROC = 0.75, avg prec = 0.32, slope = 0.95). CONCLUSION AND IMPLICATIONS A scoring system based on logistic regression may be more feasible to run in current electronic medical records. But MLP models may reduce cognitive burden and increase ability to calibrate to local data, yielding clinical specificity in mortality prediction so that palliative care resources may be allocated more effectively.
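For readers unfamiliar with the discrimination metric reported above, AUROC can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. The sketch below computes it by direct pairwise comparison; the function name and the toy labels and scores are hypothetical and are not data from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the event (e.g. death), 0 otherwise.
    scores: predicted risk, higher meaning more likely to have the event.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Toy example: four patients, two of whom had the event.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise form is quadratic in the number of patients, which is fine for illustration; library implementations typically use a rank-based computation instead.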
Affiliation(s)
- Michael P Cary
- School of Nursing, Duke University, Durham, NC, USA; Center for the Study of Aging and Human Development, Duke University, Durham, NC, USA.
| | - Farica Zhuang
- Department of Computer Science, Duke University, Durham, NC, USA
| | - Rachel Lea Draelos
- Department of Computer Science, Duke University, Durham, NC, USA; School of Medicine, Duke University, Durham, NC, USA
| | - Wei Pan
- School of Nursing, Duke University, Durham, NC, USA
| | - Yunah Kang
- School of Nursing, Duke University, Durham, NC, USA
| | - Cathleen S Colón-Emeric
- Center for the Study of Aging and Human Development, Duke University, Durham, NC, USA; School of Medicine, Duke University, Durham, NC, USA; Geriatric Research, Education and Clinical Center, Durham Veterans Affairs Medical Center, Durham, NC, USA
|
99
|
D'Ascenzo F, De Filippo O, Gallone G, Mittone G, Deriu MA, Iannaccone M, Ariza-Solé A, Liebetrau C, Manzano-Fernández S, Quadri G, Kinnaird T, Campo G, Simao Henriques JP, Hughes JM, Dominguez-Rodriguez A, Aldinucci M, Morbiducci U, Patti G, Raposeiras-Roubin S, Abu-Assi E, De Ferrari GM. Machine learning-based prediction of adverse events following an acute coronary syndrome (PRAISE): a modelling study of pooled datasets. Lancet 2021; 397:199-207. [PMID: 33453782 DOI: 10.1016/s0140-6736(20)32519-8] [Citation(s) in RCA: 179] [Impact Index Per Article: 44.8] [Received: 08/28/2020] [Revised: 10/16/2020] [Accepted: 11/09/2020] [Indexed: 02/08/2023]
Abstract
BACKGROUND The accuracy of current prediction tools for ischaemic and bleeding events after an acute coronary syndrome (ACS) remains insufficient for individualised patient management strategies. We developed a machine learning-based risk stratification model to predict all-cause death, recurrent acute myocardial infarction, and major bleeding after ACS. METHODS Different machine learning models for the prediction of 1-year post-discharge all-cause death, myocardial infarction, and major bleeding (defined as Bleeding Academic Research Consortium type 3 or 5) were trained on a cohort of 19 826 adult patients with ACS (split into a training cohort [80%] and internal validation cohort [20%]) from the BleeMACS and RENAMI registries, which included patients across several continents. 25 clinical features routinely assessed at discharge were used to inform the models. The best-performing model for each study outcome (the PRAISE score) was tested in an external validation cohort of 3444 patients with ACS pooled from a randomised controlled trial and three prospective registries. Model performance was assessed according to a range of learning metrics including area under the receiver operating characteristic curve (AUC). FINDINGS The PRAISE score showed an AUC of 0·82 (95% CI 0·78-0·85) in the internal validation cohort and 0·92 (0·90-0·93) in the external validation cohort for 1-year all-cause death; an AUC of 0·74 (0·70-0·78) in the internal validation cohort and 0·81 (0·76-0·85) in the external validation cohort for 1-year myocardial infarction; and an AUC of 0·70 (0·66-0·75) in the internal validation cohort and 0·86 (0·82-0·89) in the external validation cohort for 1-year major bleeding. INTERPRETATION A machine learning-based approach for the identification of predictors of events after an ACS is feasible and effective. The PRAISE score showed accurate discriminative capabilities for the prediction of all-cause death, myocardial infarction, and major bleeding, and might be useful to guide clinical decision making. FUNDING None.
Affiliation(s)
- Fabrizio D'Ascenzo
- Division of Cardiology, Cardiovascular and Thoracic Department, Città della Salute e della Scienza, Turin, Italy; Cardiology, Department of Medical Sciences, University of Turin, Turin, Italy.
| | - Ovidio De Filippo
- Division of Cardiology, Cardiovascular and Thoracic Department, Città della Salute e della Scienza, Turin, Italy; Cardiology, Department of Medical Sciences, University of Turin, Turin, Italy
| | - Guglielmo Gallone
- Division of Cardiology, Cardiovascular and Thoracic Department, Città della Salute e della Scienza, Turin, Italy; Cardiology, Department of Medical Sciences, University of Turin, Turin, Italy
| | - Gianluca Mittone
- Department of Computer Science, University of Turin, Turin, Italy
| | - Marco Agostino Deriu
- Polito BIO Med Lab, Department of Mechanical and Aerospace Engineering, Politecnico di Torino, Turin, Italy
| | - Albert Ariza-Solé
- Department of Cardiology, University Hospital de Bellvitge, Barcelona, Spain
| | - Giorgio Quadri
- Interventional Cardiology Unit, Degli Infermi Hospital, Turin, Italy
| | - Tim Kinnaird
- Cardiology Department, University Hospital of Wales, Cardiff, UK
| | - Gianluca Campo
- Azienda Ospedaliera Universitaria di Ferrara, Ferrara, Italy
| | - Marco Aldinucci
- Department of Computer Science, University of Turin, Turin, Italy
| | - Umberto Morbiducci
- Polito BIO Med Lab, Department of Mechanical and Aerospace Engineering, Politecnico di Torino, Turin, Italy
| | - Giuseppe Patti
- Catheterization Laboratory, Maggiore della Carità Hospital, Novara, Italy
| | - Emad Abu-Assi
- Department of Cardiology, University Hospital Álvaro Cunqueiro, Vigo, Spain
| | - Gaetano Maria De Ferrari
- Division of Cardiology, Cardiovascular and Thoracic Department, Città della Salute e della Scienza, Turin, Italy; Cardiology, Department of Medical Sciences, University of Turin, Turin, Italy
|
100
|
Hurley NC, Spatz ES, Krumholz HM, Jafari R, Mortazavi BJ. A Survey of Challenges and Opportunities in Sensing and Analytics for Risk Factors of Cardiovascular Disorders. ACM Trans Comput Healthc 2021; 2:9. [PMID: 34337602 PMCID: PMC8320445 DOI: 10.1145/3417958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2019] [Accepted: 08/01/2020] [Indexed: 10/22/2022]
Abstract
Cardiovascular disorders cause nearly one in three deaths in the United States. Short- and long-term care for these disorders is often determined in short-term settings. However, these decisions are made with minimal longitudinal and long-term data. To overcome this bias towards data from acute care settings, improved longitudinal monitoring for cardiovascular patients is needed. Longitudinal monitoring provides a more comprehensive picture of patient health, allowing for informed decision making. This work surveys sensing and machine learning in the field of remote health monitoring for cardiovascular disorders. We highlight three needs in the design of new smart health technologies: (1) need for sensing technologies that track longitudinal trends of the cardiovascular disorder despite infrequent, noisy, or missing data measurements; (2) need for new analytic techniques designed in a longitudinal, continual fashion to aid in the development of new risk prediction techniques and in tracking disease progression; and (3) need for personalized and interpretable machine learning techniques, allowing for advancements in clinical decision making. We highlight these needs based upon the current state of the art in smart health technologies and analytics. We then discuss opportunities in addressing these needs for development of smart health technologies for the field of cardiovascular disorders and care.
|