1
Fridgeirsson EA, Williams R, Rijnbeek P, Suchard MA, Reps JM. Comparing penalization methods for linear models on large observational health data. J Am Med Inform Assoc 2024:ocae109. [PMID: 38767857 DOI: 10.1093/jamia/ocae109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Revised: 04/19/2024] [Accepted: 05/06/2024] [Indexed: 05/22/2024] Open
Abstract
OBJECTIVE This study evaluates regularization variants in logistic regression (L1, L2, ElasticNet, Adaptive L1, Adaptive ElasticNet, Broken adaptive ridge [BAR], and Iterative hard thresholding [IHT]) for discrimination and calibration performance, focusing on both internal and external validation. MATERIALS AND METHODS We use data from 5 US claims and electronic health record databases and develop models for various outcomes in a major depressive disorder patient population. We externally validate all models in the other databases. We use a train-test split of 75%/25% and evaluate performance with discrimination and calibration. Statistical analysis for difference in performance uses Friedman's test and critical difference diagrams. RESULTS Of the 840 models we develop, L1 and ElasticNet emerge as superior in both internal and external discrimination, with a notable AUC difference. BAR and IHT show the best internal calibration, without a clear external calibration leader. ElasticNet typically has larger model sizes than L1. Methods like IHT and BAR, while slightly less discriminative, significantly reduce model complexity. CONCLUSION L1 and ElasticNet offer the best discriminative performance in logistic regression for healthcare predictions, maintaining robustness across validations. For simpler, more interpretable models, L0-based methods (IHT and BAR) are advantageous, providing greater parsimony and calibration with fewer features. This study aids in selecting suitable regularization techniques for healthcare prediction models, balancing performance, complexity, and interpretability.
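The contrast the abstract draws between L1-type and L0-type penalties can be illustrated with the two textbook thresholding operators. This is a minimal sketch, not the authors' implementation: `soft_threshold` is the standard proximal step for an L1 penalty, `hard_threshold` is the projection used in iterative hard thresholding, and the coefficient values, `lam`, and `k` are made up for illustration.

```python
def soft_threshold(coefs, lam):
    """L1 proximal step: shrink every coefficient toward zero by lam."""
    out = []
    for c in coefs:
        if c > lam:
            out.append(c - lam)
        elif c < -lam:
            out.append(c + lam)
        else:
            out.append(0.0)  # small coefficients are set exactly to zero
    return out


def hard_threshold(coefs, k):
    """IHT-style projection: keep the k largest-magnitude coefficients unchanged."""
    order = sorted(range(len(coefs)), key=lambda i: abs(coefs[i]), reverse=True)
    keep = set(order[:k])
    return [c if i in keep else 0.0 for i, c in enumerate(coefs)]


# Hypothetical coefficient vector, for illustration only.
coefs = [2.0, -0.5, 0.25, -1.5, 0.75]
l1_like = soft_threshold(coefs, 0.5)   # shrinks survivors, keeps 3 features
l0_like = hard_threshold(coefs, 2)     # keeps exactly 2 features, unshrunk
```

Soft thresholding shrinks the surviving coefficients, while hard thresholding caps the number of features directly. This is consistent with the abstract's observation that the L0-based methods yield smaller models.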
Affiliation(s)
- Egill A Fridgeirsson
  - Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Ross Williams
  - Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Peter Rijnbeek
  - Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
- Marc A Suchard
  - Department of Biostatistics, University of California, Los Angeles, Los Angeles, CA 90095-1772, United States
  - VA Informatics and Computing Infrastructure, United States Department of Veterans Affairs, Salt Lake City, UT 84148, United States
- Jenna M Reps
  - Department of Medical Informatics, Erasmus University Medical Center, 3015 GD Rotterdam, The Netherlands
  - Observational Health Data Analytics, Janssen Research and Development, Titusville, NJ 08560, United States
2
Han L, Chen X, Wang Y, Zhang R, Zhao T, Pu L, Huang Y, Sun H. A machine learning algorithm based on circulating metabolic biomarkers offers improved predictions of neurological diseases. Clin Chim Acta 2024; 558:119671. [PMID: 38621587 DOI: 10.1016/j.cca.2024.119671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Revised: 04/01/2024] [Accepted: 04/10/2024] [Indexed: 04/17/2024]
Abstract
BACKGROUND AND AIMS A machine learning algorithm based on circulating metabolic biomarkers for the prediction of neurological diseases (NLDs) is lacking. We aimed to develop a machine learning algorithm and to compare the performance of a metabolic biomarker-based model with that of a clinical model based on conventional risk factors for predicting three NLDs: dementia, Parkinson's disease (PD), and Alzheimer's disease (AD). MATERIALS AND METHODS The eXtreme Gradient Boosting (XGBoost) algorithm was used to construct a metabolic biomarker-based model (metabolic model), a clinical risk factor-based model (clinical model), and a combined model for the prediction of the three NLDs. Risk discrimination (c-statistic), net reclassification improvement (NRI) index, and integrated discrimination improvement (IDI) index values were determined for each model. RESULTS The results indicate that incorporation of metabolic biomarkers into the clinical model afforded a model with improved performance in the prediction of dementia, AD, and PD, as demonstrated by NRI values of 0.159 (0.039-0.279), 0.113 (0.005-0.176), and 0.201 (-0.021-0.423), respectively; and IDI values of 0.098 (0.073-0.122), 0.070 (0.049-0.090), and 0.085 (0.068-0.101), respectively. CONCLUSION The performance of the model based on circulating NMR spectroscopy-detected metabolic biomarkers was better than that of the clinical model in the prediction of dementia, AD, and PD.
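For readers unfamiliar with the two reclassification indices reported above, the sketch below computes them from predicted risks under a baseline and an extended model. It assumes the category-free (continuous) form of NRI; the abstract does not say which variant the authors used. The labels and risks are invented for illustration.

```python
def mean(xs):
    return sum(xs) / len(xs)


def idi(y, p_old, p_new):
    """Integrated discrimination improvement: the change in mean predicted risk
    among events minus the change among non-events."""
    ev_old = [p for p, t in zip(p_old, y) if t == 1]
    ev_new = [p for p, t in zip(p_new, y) if t == 1]
    ne_old = [p for p, t in zip(p_old, y) if t == 0]
    ne_new = [p for p, t in zip(p_new, y) if t == 0]
    return (mean(ev_new) - mean(ev_old)) - (mean(ne_new) - mean(ne_old))


def continuous_nri(y, p_old, p_new):
    """Category-free NRI: net upward movement among events plus
    net downward movement among non-events."""
    up_ev = down_ev = up_ne = down_ne = 0
    for t, a, b in zip(y, p_old, p_new):
        if b > a:
            if t == 1:
                up_ev += 1
            else:
                up_ne += 1
        elif b < a:
            if t == 1:
                down_ev += 1
            else:
                down_ne += 1
    n_ev = sum(y)
    n_ne = len(y) - n_ev
    return (up_ev - down_ev) / n_ev + (down_ne - up_ne) / n_ne


# Hypothetical data: 1 = event, 0 = non-event.
y = [1, 1, 0, 0]
p_clinical = [0.5, 0.5, 0.5, 0.5]
p_combined = [0.75, 0.25, 0.25, 0.25]
nri_value = continuous_nri(y, p_clinical, p_combined)
idi_value = idi(y, p_clinical, p_combined)
```

In this toy case the extended model helps only among non-events, so both indices are positive even though event risks move in both directions.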
Affiliation(s)
- Liyuan Han
  - Key Laboratory of Diagnosis and Treatment of Digestive System Tumors of Zhejiang Province, Ningbo No 2 Hospital, Ningbo 315000, China; Center for Cardiovascular and Cerebrovascular Epidemiology and Translational Medicine, Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, Ningbo 315000, China
- Xi Chen
  - Department of Economics, Yale University, USA; Yale Alzheimer's Disease Research Center, Yale University, USA
- Yue Wang
  - School of Public Health, Medical College of Soochow University, Suzhou, Jiangsu province, China
- Ruijie Zhang
  - Key Laboratory of Diagnosis and Treatment of Digestive System Tumors of Zhejiang Province, Ningbo No 2 Hospital, Ningbo 315000, China; Center for Cardiovascular and Cerebrovascular Epidemiology and Translational Medicine, Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, Ningbo 315000, China
- Tian Zhao
  - Key Laboratory of Diagnosis and Treatment of Digestive System Tumors of Zhejiang Province, Ningbo No 2 Hospital, Ningbo 315000, China; Center for Cardiovascular and Cerebrovascular Epidemiology and Translational Medicine, Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, Ningbo 315000, China
- Liyuan Pu
  - Key Laboratory of Diagnosis and Treatment of Digestive System Tumors of Zhejiang Province, Ningbo No 2 Hospital, Ningbo 315000, China; Center for Cardiovascular and Cerebrovascular Epidemiology and Translational Medicine, Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, Ningbo 315000, China
- Yi Huang
  - Laboratory of Neurological Diseases and Brain Function, Department of Neurosurgery, The First Affiliated Hospital of Ningbo University, Ningbo, Zhejiang 315010, China; Key Laboratory of Precision Medicine for Atherosclerotic Diseases of Zhejiang Province, Ningbo, Zhejiang 315010, China
- Hongpeng Sun
  - School of Public Health, Medical College of Soochow University, Suzhou, Jiangsu province, China
3
Naderalvojoud B, Curtin CM, Yanover C, El-Hay T, Choi B, Park RW, Tabuenca JG, Reeve MP, Falconer T, Humphreys K, Asch SM, Hernandez-Boussard T. Towards global model generalizability: independent cross-site feature evaluation for patient-level risk prediction models using the OHDSI network. J Am Med Inform Assoc 2024; 31:1051-1061. [PMID: 38412331 PMCID: PMC11031239 DOI: 10.1093/jamia/ocae028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Revised: 01/26/2024] [Accepted: 02/01/2024] [Indexed: 02/29/2024] Open
Abstract
BACKGROUND Predictive models show promise in healthcare, but their successful deployment is challenging due to limited generalizability. Current external validation often focuses on model performance with restricted feature use from the original training data, lacking insights into their suitability at external sites. Our study introduces an innovative methodology for evaluating features during both the development phase and the validation, focusing on creating and validating predictive models for post-surgery patient outcomes with improved generalizability. METHODS Electronic health records (EHRs) from 4 countries (United States, United Kingdom, Finland, and Korea) were mapped to the OMOP Common Data Model (CDM), 2008-2019. Machine learning (ML) models were developed to predict post-surgery prolonged opioid use (POU) risks using data collected 6 months before surgery. Both local and cross-site feature selection methods were applied in the development and external validation datasets. Models were developed using Observational Health Data Sciences and Informatics (OHDSI) tools and validated on separate patient cohorts. RESULTS Model development included 41 929 patients, 14.6% with POU. The external validation included 31 932 (UK), 23 100 (US), 7295 (Korea), and 3934 (Finland) patients with POU of 44.2%, 22.0%, 15.8%, and 21.8%, respectively. The top-performing model, Lasso logistic regression, achieved an area under the receiver operating characteristic curve (AUROC) of 0.75 during local validation and 0.69 (SD = 0.02) (averaged) in external validation. Models trained with cross-site feature selection significantly outperformed those using only features from the development site through external validation (P < .05). CONCLUSIONS Using EHRs across four countries mapped to the OMOP CDM, we developed generalizable predictive models for POU. 
Our approach demonstrates the significant impact of cross-site feature selection in improving model performance, underscoring the importance of incorporating diverse feature sets from various clinical settings to enhance the generalizability and utility of predictive healthcare models.
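The AUROC figures quoted for local and external validation can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it. The following is a minimal sketch of that pairwise definition; the labels and scores are invented and the function name is an assumption, not part of the OHDSI tooling.

```python
def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a correct pair."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical validation cohort: 1 = outcome observed, 0 = not observed.
y_site = [1, 0, 1, 0, 1]
risk_site = [0.9, 0.8, 0.7, 0.3, 0.2]
auc_site = auroc(y_site, risk_site)
```

The quadratic pair loop is chosen for clarity; on cohorts of the size reported above, a rank-based computation would be the practical choice.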
Affiliation(s)
- Catherine M Curtin
  - Department of Surgery, Veterans Affairs Palo Alto Health Care System, Palo Alto, CA 94304, United States
- Chen Yanover
  - KI Research Institute, Kfar Malal, 4592000, Israel
- Tal El-Hay
  - KI Research Institute, Kfar Malal, 4592000, Israel
- Byungjin Choi
  - Department of Biomedical Informatics, Ajou University Graduate School of Medicine, Suwon, 16499, Korea
- Rae Woong Park
  - Department of Biomedical Informatics, Ajou University Graduate School of Medicine, Suwon, 16499, Korea
- Javier Gracia Tabuenca
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, 00014, Finland
- Mary Pat Reeve
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, 00014, Finland
- Thomas Falconer
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States
- Keith Humphreys
  - Department of Psychiatry and the Behavioral Sciences, Stanford University, Stanford, CA 94305, United States
  - Center for Innovation to Implementation, Veterans Affairs Palo Alto Health Care System, Palo Alto, CA 94304, United States
- Steven M Asch
  - Department of Medicine, Stanford University, Stanford, CA 94305, United States
  - Center for Innovation to Implementation, Veterans Affairs Palo Alto Health Care System, Palo Alto, CA 94304, United States
4
Fridgeirsson EA, Sontag D, Rijnbeek P. Attention-based neural networks for clinical prediction modelling on electronic health records. BMC Med Res Methodol 2023; 23:285. [PMID: 38062352 PMCID: PMC10701944 DOI: 10.1186/s12874-023-02112-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 11/27/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND Deep learning models have had a lot of success in various fields. However, on structured data they have struggled. Here we apply four state-of-the-art supervised deep learning models using the attention mechanism and compare against logistic regression and XGBoost using discrimination, calibration and clinical utility. METHODS We develop the models using a general practitioners database. We implement a recurrent neural network, a transformer with and without reverse distillation and a graph neural network. We measure discrimination using the area under the receiver operating characteristic curve (AUC) and the area under the precision recall curve (AUPRC). We assess smooth calibration using restricted cubic splines and clinical utility with decision curve analysis. RESULTS Our results show that deep learning approaches can improve discrimination by up to 2.5 percentage points in AUC and 7.4 percentage points in AUPRC. However, on average the baselines are competitive. Most models are calibrated similarly to the baselines, except for the graph neural network. The transformer using reverse distillation shows the best performance in clinical utility on two out of three prediction problems over most of the prediction thresholds. CONCLUSION In this study, we evaluated various approaches in supervised learning using neural networks and attention. Here we do a rigorous comparison, not only looking at discrimination but also calibration and clinical utility. There is value in using deep learning models on electronic health record data since it can improve discrimination and clinical utility while providing good calibration. However, good baseline methods are still competitive.
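Decision curve analysis, used above to assess clinical utility, plots net benefit against the risk threshold. The sketch below uses the standard net-benefit formula, true positives per patient minus false positives per patient weighted by the odds of the threshold. The data and the `net_benefit` helper are illustrative assumptions rather than the authors' code.

```python
def net_benefit(y_true, risks, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for t, r in zip(y_true, risks) if r >= threshold and t == 1)
    fp = sum(1 for t, r in zip(y_true, risks) if r >= threshold and t == 0)
    weight = threshold / (1.0 - threshold)  # harm of a false positive relative to a true positive
    return tp / n - (fp / n) * weight


# Hypothetical outcomes and predicted risks.
y = [1, 1, 0, 0, 0]
risks = [0.9, 0.6, 0.7, 0.2, 0.1]

nb_model = net_benefit(y, risks, 0.5)
# "Treat all" reference strategy: every patient is above any threshold.
nb_treat_all = net_benefit(y, [1.0] * len(y), 0.5)
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.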
Affiliation(s)
- Egill A Fridgeirsson
  - Department of Medical Informatics, Erasmus University Medical Center, Doctor Molewaterplein 40, 3015 GD, Rotterdam, the Netherlands
- David Sontag
  - Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Peter Rijnbeek
  - Department of Medical Informatics, Erasmus University Medical Center, Doctor Molewaterplein 40, 3015 GD, Rotterdam, the Netherlands
5
Tian M, Ma X, Liang M, Zang H. Application of Rapid Identification and Determination of Moisture Content of Coptidis Rhizoma From Different Species Based on Data Fusion. J AOAC Int 2023; 106:1389-1401. [PMID: 37171863 DOI: 10.1093/jaoacint/qsad058] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/22/2023] [Revised: 04/25/2023] [Accepted: 05/08/2023] [Indexed: 05/13/2023]
Abstract
BACKGROUND For thousands of years, traditional Chinese medicine (TCM) has been clinically proven, and doctors have highly valued the differences in utility between different species. OBJECTIVE This study aims to replace the complex methods traditionally used for empirical identification by compensating for the information loss of a single sensor through data fusion. The research object of the study is Coptidis rhizoma (CR). METHOD Using spectral optimization and data fusion technology, near infrared (NIR) and mid-infrared (MIR) spectra were collected for CR. PLS-DA (n = 134) and PLSR (n = 63) models were established to identify the medicinal materials and to determine the moisture content in the medicinal materials. RESULTS For the identification of the three species of CR, the mid-level fusion model performed better than the single-spectrum model. The sensitivity and specificity of the prediction set coefficients for NIR, MIR, and data fusion qualitative models were all higher than 0.95, with an AUC value of 1. The NIR data model was superior to the MIR data model. The results of low-level fusion were similar to those of the NIR optimization model. The RPD of the test set of NIR and low-level fusion model was 3.6420 and 3.4216, respectively, indicating good prediction ability of the model. CONCLUSIONS Data fusion technology using NIR and MIR can be applied to identify CR species and to determine the moisture content of CR. It provides technical support for the rapid determination of moisture content, with a fast analysis speed and without the need for complex pretreatment methods. HIGHLIGHTS This study is the first to introduce spectral data fusion technology to identify CR species. Data fusion technology is feasible for multivariable calibration model performance and reduces the cost of manual identification. The moisture content of CR can be quickly evaluated, reducing the difficulty of traditional methods.
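The RPD values reported for the moisture models are ratios of the spread of the reference values to the prediction error. This minimal sketch assumes the common definition RPD = SD / RMSEP with the sample standard deviation; conventions differ on sample versus population SD, and the moisture values below are invented.

```python
import math
import statistics


def rmsep(y_ref, y_pred):
    """Root mean square error of prediction on a test set."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_ref, y_pred)) / len(y_ref))


def rpd(y_ref, y_pred):
    """Ratio of performance to deviation: SD of reference values over RMSEP."""
    return statistics.stdev(y_ref) / rmsep(y_ref, y_pred)


# Hypothetical reference moisture contents (%) and model predictions.
moisture_ref = [10.0, 12.0, 14.0, 16.0]
moisture_pred = [10.5, 11.5, 14.5, 15.5]

error = rmsep(moisture_ref, moisture_pred)
ratio = rpd(moisture_ref, moisture_pred)
```

A larger RPD means the prediction error is small relative to the natural variation in the samples, which is how the abstract reads the values above 3.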
Affiliation(s)
- Mengyin Tian
  - Shandong University, NMPA Key Laboratory for Technology Research and Evaluation of Drug Products, Cheeloo College of Medicine, Jinan, Shandong 250012, China
  - Shandong University, Key Laboratory of Chemical Biology (Ministry of Education), Jinan, Shandong 250012, China
- Xiaobo Ma
  - Shandong University, NMPA Key Laboratory for Technology Research and Evaluation of Drug Products, Cheeloo College of Medicine, Jinan, Shandong 250012, China
  - Shandong University, Key Laboratory of Chemical Biology (Ministry of Education), Jinan, Shandong 250012, China
- Mengying Liang
  - Shandong University, NMPA Key Laboratory for Technology Research and Evaluation of Drug Products, Cheeloo College of Medicine, Jinan, Shandong 250012, China
  - Shandong University, Key Laboratory of Chemical Biology (Ministry of Education), Jinan, Shandong 250012, China
- Hengchang Zang
  - Shandong University, NMPA Key Laboratory for Technology Research and Evaluation of Drug Products, Cheeloo College of Medicine, Jinan, Shandong 250012, China
  - Shandong University, Key Laboratory of Chemical Biology (Ministry of Education), Jinan, Shandong 250012, China
  - Shandong University, National Glycoengineering Research Center, Jinan, Shandong 250012, China
6
Fehr J, Piccininni M, Kurth T, Konigorski S. Assessing the transportability of clinical prediction models for cognitive impairment using causal models. BMC Med Res Methodol 2023; 23:187. [PMID: 37598141 PMCID: PMC10439645 DOI: 10.1186/s12874-023-02003-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Accepted: 07/27/2023] [Indexed: 08/21/2023] Open
Abstract
BACKGROUND Machine learning models promise to support diagnostic predictions, but may not perform well in new settings. Selecting the best model for a new setting without available data is challenging. We aimed to investigate the transportability by calibration and discrimination of prediction models for cognitive impairment in simulated external settings with different distributions of demographic and clinical characteristics. METHODS We mapped and quantified relationships between variables associated with cognitive impairment using causal graphs, structural equation models, and data from the ADNI study. These estimates were then used to generate datasets and evaluate prediction models with different sets of predictors. We measured transportability to external settings under guided interventions on age, APOE ε4, and tau-protein, using performance differences between internal and external settings measured by calibration metrics and area under the receiver operating curve (AUC). RESULTS Calibration differences indicated that models predicting with causes of the outcome were more transportable than those predicting with consequences. AUC differences indicated inconsistent trends of transportability between the different external settings. Models predicting with consequences tended to show higher AUC in the external settings compared to internal settings, while models predicting with parents or all variables showed similar AUC. CONCLUSIONS We demonstrated with a practical prediction task example that predicting with causes of the outcome results in better transportability compared to anti-causal predictions when considering calibration differences. We conclude that calibration performance is crucial when assessing model transportability to external settings.
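The idea of measuring transportability as a performance difference between internal and external settings can be sketched with one simple calibration summary. The abstract does not name its calibration metrics, so the choice here of calibration-in-the-large (mean predicted risk minus observed event rate) is an assumption, and all data are invented.

```python
def calibration_in_the_large(y_true, risks):
    """Mean predicted risk minus observed event rate; 0 means the model's
    average risk matches the outcome prevalence in that setting."""
    return sum(risks) / len(risks) - sum(y_true) / len(y_true)


def transport_gap(internal, external):
    """Absolute change in the calibration summary when moving from the
    internal to the external setting. Each argument is a (y_true, risks) pair."""
    return abs(calibration_in_the_large(*external) - calibration_in_the_large(*internal))


# Hypothetical settings: the same predicted risks, but the external
# population has a higher outcome prevalence.
internal = ([1, 0, 0, 0], [0.5, 0.25, 0.25, 0.0])
external = ([1, 1, 0, 0], [0.5, 0.25, 0.25, 0.0])

cal_internal = calibration_in_the_large(*internal)
cal_external = calibration_in_the_large(*external)
gap = transport_gap(internal, external)
```

Here the model is well calibrated on average internally but underestimates risk externally. A shift of this kind can leave rank-based discrimination such as AUC largely unchanged while still degrading calibration.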
Affiliation(s)
- Jana Fehr
  - Digital Engineering Faculty, University of Potsdam, Potsdam, Germany
  - Digital Health and Machine Learning, Hasso-Plattner-Institute, Potsdam, Germany
- Marco Piccininni
  - Institute of Public Health, Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Center for Stroke Research Berlin, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Tobias Kurth
  - Institute of Public Health, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Stefan Konigorski
  - Digital Engineering Faculty, University of Potsdam, Potsdam, Germany
  - Digital Health and Machine Learning, Hasso-Plattner-Institute, Potsdam, Germany
  - Icahn School of Medicine at Mount Sinai, Hasso Plattner Institute for Digital Health at Mount Sinai, New York, NY, USA