1
Hong T, Xie S, Liu X, Wu J, Chen G. Do Machine Learning Approaches Perform Better Than Regression Models in Mapping Studies? A Systematic Review. VALUE IN HEALTH : THE JOURNAL OF THE INTERNATIONAL SOCIETY FOR PHARMACOECONOMICS AND OUTCOMES RESEARCH 2025; 28:800-811. [PMID: 39922301 DOI: 10.1016/j.jval.2024.12.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2024] [Revised: 12/06/2024] [Accepted: 12/16/2024] [Indexed: 02/10/2025]
Abstract
OBJECTIVES To identify how machine learning (ML) approaches were implemented in mapping studies and to determine the extent to which ML improved performance compared with regression models (RMs). METHODS A systematic literature search was conducted in 12 databases from inception to December 2023 to identify studies that applied ML to develop mapping algorithms. A data template was applied to extract data set information, source and target measures, ML approaches and RMs, mapping types (direct vs indirect), goodness-of-fit indicators (mean absolute error, mean squared error, root mean squared error, R-squared, and intraclass correlation coefficient), and validation methods. Differences in goodness-of-fit indicators between ML and RMs were summarized. Potential advantages and challenges for ML were further discussed. RESULTS Thirteen mapping studies were identified, in which both ML and RM were adopted. Bayesian networks were the most frequently used ML approach (n = 6), followed by the least absolute shrinkage and selection operator (n = 4). The ordinary least squares model was the most used RM (n = 8), followed by the censored least absolute deviation and multinomial logit models (n = 5 each). The average improvements in goodness-of-fit of ML over RMs, by indicator, were 0.007 (mean absolute error), 0.004 (mean squared error), 0.058 (R-squared), 0.016 (intraclass correlation coefficient), and -0.0004 (root mean squared error). CONCLUSIONS There is an increasing number of studies using ML in developing mapping algorithms. Generally, a minor improvement of goodness-of-fit was observed compared with RMs when using mean-based comparisons. Issues such as how to interpret, apply, and externally validate the ML-based outputs would affect their implementation. Future studies are warranted to verify advantages of ML approaches.
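The goodness-of-fit indicators named in this abstract can be illustrated with a minimal sketch. It computes the mean absolute error, mean squared error, root mean squared error, and R-squared from paired observed and predicted utilities. The function name `goodness_of_fit` and the sample values are hypothetical and are not taken from the review; the intraclass correlation coefficient is omitted for brevity.

```python
import math


def goodness_of_fit(observed, predicted):
    """Return MAE, MSE, RMSE and R-squared for paired utility values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mse = sum(e * e for e in errors) / n           # mean squared error
    rmse = math.sqrt(mse)                          # root mean squared error
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum(e * e for e in errors)
    # R-squared is undefined when the observed values do not vary.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical observed and predicted utilities, for illustration only.
observed = [0.80, 0.60, 0.90, 0.70]
predicted = [0.75, 0.65, 0.85, 0.70]
fit = goodness_of_fit(observed, predicted)
print(fit)
```

For the error-based indicators a lower value means a better fit, while for R-squared a higher value is better, so the sign of an "improvement" depends on the indicator being compared.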
Affiliation(s)
- Tianqi Hong
  - School of Biomedical Engineering, McMaster University, Hamilton, ON, Canada
- Shitong Xie
  - School of Pharmaceutical Science and Technology, Tianjin University, Tianjin, China; Center for Social Science Survey and Data, Tianjin University, Tianjin, China
- Xinran Liu
  - School of Pharmaceutical Science and Technology, Tianjin University, Tianjin, China; Center for Social Science Survey and Data, Tianjin University, Tianjin, China
- Jing Wu
  - School of Pharmaceutical Science and Technology, Tianjin University, Tianjin, China; Center for Social Science Survey and Data, Tianjin University, Tianjin, China
- Gang Chen
  - Centre for Health Economics, Monash Business School, Monash University, Melbourne, VIC, Australia; Melbourne School of Population and Global Health, University of Melbourne, Melbourne, VIC, Australia
2
Davies A, Zamora-Talaya B, Sabharwal S, Liddle AD, Vella-Baldacchino M, Rangan A, Reilly P. Cost-effectiveness of total shoulder arthroplasty compared with hemiarthroplasty: a study using data from the National Joint Registry. BMJ Open 2025; 15:e086150. [PMID: 40107707 PMCID: PMC11927414 DOI: 10.1136/bmjopen-2024-086150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Accepted: 02/11/2025] [Indexed: 03/22/2025] Open
Abstract
OBJECTIVES The aim of this study was to compare the cost-effectiveness of total shoulder arthroplasty (TSA) and hemiarthroplasty (HA) and explore variation by age and gender. DESIGN Cost-effectiveness analysis using a lifetime cohort Markov model. SETTING National population registry data. PARTICIPANTS Model parameters were informed by propensity score-matched comparisons of TSA and HA in patients with osteoarthritis and an intact rotator cuff using data from the National Joint Registry. INTERVENTIONS TSA and HA. PRIMARY OUTCOME MEASURES Quality-adjusted life years (QALYs) and healthcare costs for age and gender subgroups. A probabilistic sensitivity analysis was performed. RESULTS In all subgroups, TSA was more cost-effective, with the probability of being cost-effective about 70% for TSA versus 30% for HA at any willingness-to-pay threshold above £1100 per QALY. TSA was dominant in young patients (≤60 years) with a mean cost saving of £463 in men and £658 in women, and a mean QALY gain of 2 in both men and women. In patients aged 61-75 years, there was a mean cost saving following HA of £395 in men and £181 in women, while QALYs remained superior following TSA with a 1.3 gain in men and 1.4 in women. In the older cohort (> 75 years), the cost difference was highest and the QALY difference was lowest; there was a cost-saving following HA of £905 in men and £966 in women. The mean QALY gain remained larger after TSA: 0.7 in men and 0.9 in women. CONCLUSION TSA was more cost-effective than HA in patients with osteoarthritis. QALYs were superior following TSA in all patient groups. Cost differences varied by age and TSA was dominant in young patients.
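The dominance and trade-off language in this abstract can be made concrete with a small worked sketch. It classifies TSA against HA from the mean incremental cost and mean incremental QALYs quoted above (TSA minus HA) and, where neither option dominates, forms a simple cost-per-QALY ratio. The function name `compare` is hypothetical, and the ratios are back-of-envelope arithmetic on the reported means; they are not results reported by the study, whose conclusions rest on a probabilistic sensitivity analysis.

```python
def compare(delta_cost, delta_qaly):
    """Classify an intervention against a comparator.

    delta_cost and delta_qaly are intervention minus comparator.
    Returns (label, ratio); ratio is None unless there is a trade-off.
    """
    if delta_qaly > 0 and delta_cost <= 0:
        return ("dominant", None)      # more QALYs at no extra cost
    if delta_qaly < 0 and delta_cost >= 0:
        return ("dominated", None)     # fewer QALYs at no saving
    if delta_qaly == 0:
        return ("no QALY difference", None)
    # Remaining cases are trade-offs. When both differences are negative,
    # the ratio is the saving per QALY forgone and needs separate reading.
    return ("trade-off", delta_cost / delta_qaly)


# Mean differences (TSA minus HA) as quoted in the abstract, GBP and QALYs.
subgroups = {
    "men <=60": (-463, 2.0),
    "women <=60": (-658, 2.0),
    "men 61-75": (395, 1.3),
    "women 61-75": (181, 1.4),
    "men >75": (905, 0.7),
    "women >75": (966, 0.9),
}

for name, (d_cost, d_qaly) in subgroups.items():
    label, ratio = compare(d_cost, d_qaly)
    if ratio is None:
        print(f"{name}: TSA {label}")
    else:
        print(f"{name}: TSA {label}, about GBP {ratio:.0f} per QALY gained")
```

Ratios of means computed this way ignore parameter uncertainty, which is why the study reports the probability of cost-effectiveness across willingness-to-pay thresholds rather than a single figure per subgroup.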
Affiliation(s)
- Sanjeeve Sabharwal
  - Trauma and Orthopaedics, Imperial College Healthcare NHS Trust, London, UK
- Alexander D Liddle
  - MSK Lab, Department of Surgery and Cancer, Imperial College London, London, UK
- Amar Rangan
  - Trauma and Orthopaedics, The James Cook University Hospital, Middlesbrough, UK
  - Department of Health Sciences, University of York, York, UK
- Peter Reilly
  - Bioengineering, Imperial College London, London, UK
  - Trauma and Orthopaedics, Imperial College Healthcare NHS Trust, London, UK
3
Tan YJ, Ong SC. Direct and Indirect Mapping of Assessment of Quality of Life - 6 Dimensions (AQoL-6D) Onto EQ-5D-5L Utilities Using Data From a Multicenter, Cross-Sectional Study of Malaysians With Chronic Heart Failure. VALUE IN HEALTH : THE JOURNAL OF THE INTERNATIONAL SOCIETY FOR PHARMACOECONOMICS AND OUTCOMES RESEARCH 2024; 27:1762-1770. [PMID: 39127252 DOI: 10.1016/j.jval.2024.07.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Revised: 06/19/2024] [Accepted: 07/25/2024] [Indexed: 08/12/2024]
Abstract
OBJECTIVES The Assessment of Quality of Life - 6 Dimensions (AQoL-6D), a generic preference-based measure, is an appealing alternative to EQ-5D-5L for assessing health status in patients with chronic heart failure (HF), given its expanded scope. However, without a Malaysian value set, the AQoL-6D cannot generate health state utility values (HSUVs) to support local economic evaluations. This study intended to develop algorithms for predicting EQ-5D-5L HSUVs from AQoL-6D in an HF population. METHODS Cross-sectional data from a multicenter cohort of 419 HF outpatients were used. Both direct and indirect mapping approaches were attempted using 5 sets of explanatory variables and 8 models (ordinary least squares, Tobit, censored least absolute deviations, generalized linear model, 2-part model [TPM], beta regression-based model, adjusted limited dependent variable mixture model, and multinomial ordinal regression [MLOGIT]). The models' predictive performance was assessed through 10-fold cross-validated mean absolute error (MAE) and root mean squared error (RMSE). Potential prediction bias was also examined graphically. The best-performing models, with the lowest RMSE and no bias, were then identified. RESULTS Among the models evaluated, TPM, which included age, sex, and 5 AQoL-6D dimension scores as predictors, appears to be the best-performing model for directly predicting EQ-5D-5L HSUVs from AQoL-6D. TPM yielded the lowest MAE (0.0802) and RMSE (0.1116), and demonstrated predictive accuracy for HSUVs >0.2 without significant bias. A MLOGIT model developed for response mapping had suboptimal predictive accuracy. CONCLUSIONS This study developed potentially useful mapping algorithms for generating Malaysian EQ-5D-5L HSUVs from AQoL-6D responses among patients with HF when direct EQ-5D-5L data are unavailable.
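The 10-fold cross-validated MAE and RMSE mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's procedure: it swaps in a single-predictor ordinary least squares fit in place of the eight models the authors compared, pools the held-out errors across folds, and runs on synthetic data. The names `fit_ols` and `k_fold_cv`, the seeds, and the simulated coefficients are all hypothetical.

```python
import math
import random


def fit_ols(xs, ys):
    """Closed-form simple linear regression, returning (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def k_fold_cv(xs, ys, k=10, seed=0):
    """Return (MAE, RMSE) pooled over all held-out predictions."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k roughly equal folds
    abs_errors, sq_errors = [], []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        intercept, slope = fit_ols([xs[i] for i in train],
                                   [ys[i] for i in train])
        # Score only the observations the model did not see.
        for i in held_out:
            error = ys[i] - (intercept + slope * xs[i])
            abs_errors.append(abs(error))
            sq_errors.append(error * error)
    n = len(abs_errors)
    return sum(abs_errors) / n, math.sqrt(sum(sq_errors) / n)


# Synthetic source scores and target utilities, for illustration only.
rng = random.Random(42)
source_scores = [rng.uniform(0, 100) for _ in range(200)]
utilities = [0.2 + 0.007 * s + rng.gauss(0, 0.05) for s in source_scores]

cv_mae, cv_rmse = k_fold_cv(source_scores, utilities, k=10)
print(f"10-fold CV MAE = {cv_mae:.4f}, RMSE = {cv_rmse:.4f}")
```

Because every observation is predicted by a model that never saw it, the resulting errors estimate out-of-sample performance, which is the basis on which competing mapping models are typically ranked.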
Affiliation(s)
- Yi Jing Tan
  - Discipline of Social and Administrative Pharmacy, School of Pharmaceutical Sciences, Universiti Sains Malaysia, Penang, Malaysia; Seri Manjung Hospital, Ministry of Health Malaysia, Seri Manjung, Malaysia
- Siew Chin Ong
  - Discipline of Social and Administrative Pharmacy, School of Pharmaceutical Sciences, Universiti Sains Malaysia, Penang, Malaysia
4
Ackerman IN, Soh SE, Hallstrom BR, Fang YY, Franklin P, Lützner J, Ingelsrud LH. A systematic review of crosswalks for converting patient-reported outcome measure scores in hip, knee, and shoulder replacement surgery. Acta Orthop 2024; 95:512-523. [PMID: 39268815 PMCID: PMC11494241 DOI: 10.2340/17453674.2024.41384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Accepted: 08/05/2024] [Indexed: 09/15/2024] Open
Abstract
BACKGROUND AND PURPOSE We aimed to systematically review studies of crosswalks for converting patient-reported outcome measure (PROM) scores used in joint replacement, and develop a database of published crosswalks. METHODS 4 electronic databases were searched from January 2000 to May 2023 to identify studies reporting the development and/or validation of crosswalks to convert PROM scores in patients undergoing elective hip, knee, or shoulder replacement surgery. Data on study and sample characteristics, source and target PROMs, and crosswalk development and validation methods were extracted from eligible studies. Study reporting was evaluated using the Mapping onto Preference-based measures reporting Standards (MAPS) checklist. RESULTS 17 studies describing 35 crosswalks were eligible for inclusion. Unidirectional crosswalks were available to convert hip-specific (Oxford Hip Score [OHS]) and knee-specific (Oxford Knee Score [OKS]) scores to the EQ-5D-3L/EQ-5D-5L. Similar crosswalks to convert disease-specific scores (WOMAC) to the EQ-5D-3L, EQ-5D-5L, and ICECAP-O Capability Index were identified. Bidirectional crosswalks for converting OHS and OKS to the HOOS-JR/HOOS-12 and KOOS-JR/KOOS-12, for converting WOMAC to the HOOS-JR/KOOS-JR, and for converting HOOS-Function/KOOS-Function to the PROMIS-Physical Function were also available. Additionally, crosswalks to convert generic PROM scores from the UCLA Activity Scale to the Lower Extremity Activity Scale in both directions were available. No crosswalks were identified for converting scores in shoulder replacement. Development methods varied with the type of target score; most studies used regression, item response theory, or equipercentile equating approaches. Reporting quality was variable, particularly for methods and results items, impacting crosswalk application. 
CONCLUSION This is the first synthesis of published crosswalks for converting joint-specific (OHS, OKS, HOOS, KOOS), disease-specific (WOMAC), and generic PROM scores (PROMIS-Physical Function, UCLA Activity Scale, Lower Extremity Activity Scale) used to assess joint replacement outcomes, providing a resource for data harmonization and pooled analysis. Crosswalks were developed using regression methods (9 studies), equipercentile equating methods (5 studies), a combination of equipercentile equating and item response theory methods (2 studies), and a combination of regression and equipercentile equating methods (1 study). A range of crosswalk validation approaches were adopted, including the use of external datasets, separate samples or subsets, follow-up data from additional time points, or bootstrapped samples. Efforts are needed to standardize crosswalk methodology and achieve consistent reporting.
Affiliation(s)
- Ilana N Ackerman
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
- Sze-Ee Soh
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia; School of Primary and Allied Health Care, Monash University, Melbourne, Australia
- Brian R Hallstrom
  - Department of Orthopaedic Surgery, University of Michigan, Ann Arbor, USA
- Yi Ying Fang
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
- Patricia Franklin
  - Departments of Medical Social Sciences, Orthopedics, and Medicine (Rheumatology), Northwestern University Feinberg School of Medicine, Chicago, USA
- Jörg Lützner
  - University Center of Orthopaedic, Trauma and Plastic Surgery, University Hospital Carl Gustav Carus, TU Dresden, Dresden, Germany
- Lina Holm Ingelsrud
  - Department of Orthopaedic Surgery, Copenhagen University Hospital, Hvidovre, Denmark
5
Senanayake S, Uchil R, Sharma P, Parsonage W, Kularatna S. Mapping Kansas City cardiomyopathy, Seattle Angina, and minnesota living with heart failure to the MacNew-7D in patients with heart disease. Qual Life Res 2024; 33:2151-2163. [PMID: 38839680 PMCID: PMC11286692 DOI: 10.1007/s11136-024-03676-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/01/2024] [Indexed: 06/07/2024]
Abstract
INTRODUCTION The Kansas City Cardiomyopathy Questionnaire (KCCQ), Seattle Angina Questionnaire (SAQ), and Minnesota Living with Heart Failure Questionnaire (MLHFQ) are widely used non-preference-based instruments that measure health-related quality of life (QOL) in people with heart disease. However, currently it is not possible to estimate quality-adjusted life-years (QALYs) for economic evaluation using these instruments as the summary scores produced are not preference-based. The MacNew-7D is a heart disease-specific preference-based instrument. This study provides different mapping algorithms for allocating utility scores to KCCQ, MLHFQ, and SAQ from MacNew-7D to calculate QALYs for economic evaluations. METHODS The study included 493 participants with heart failure or angina who completed the KCCQ, MLHFQ, SAQ, and MacNew-7D questionnaires. Regression techniques, namely, Gamma Generalized Linear Model (GLM), Bayesian GLM, Linear regression with stepwise selection and Random Forest were used to develop direct mapping algorithms. Cross-validation was employed due to the absence of an external validation dataset. The study followed the Mapping onto Preference-based measures reporting Standards checklist. RESULTS The best models to predict MacNew-7D utility scores were determined using KCCQ, MLHFQ, and SAQ item and domain scores. Random Forest performed well for item scores for all questionnaires and domain score for KCCQ, while Bayesian GLM and Linear Regression were best for MLHFQ and SAQ domain scores. However, models tended to over-predict severe health states. CONCLUSION The three cardiac-specific non-preference-based QOL instruments can be mapped onto MacNew-7D utilities with good predictive accuracy using both direct response mapping techniques. The reported mapping algorithms may facilitate estimation of health utility for economic evaluations that have used these QOL instruments.
Affiliation(s)
- Sameera Senanayake
  - Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
  - Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, School of Public health and Social Work, Faculty of Health, Queensland University of Technology, Brisbane, QLD, 4059, Australia
  - National Heart Research Institute Singapore, National Heart Centre Singapore, Singapore, Singapore
- Rithika Uchil
  - Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, School of Public health and Social Work, Faculty of Health, Queensland University of Technology, Brisbane, QLD, 4059, Australia
- Pakhi Sharma
  - Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, School of Public health and Social Work, Faculty of Health, Queensland University of Technology, Brisbane, QLD, 4059, Australia
- William Parsonage
  - Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, School of Public health and Social Work, Faculty of Health, Queensland University of Technology, Brisbane, QLD, 4059, Australia
  - Royal Brisbane and Women's Hospital, Metro North Health, Brisbane, QLD, Australia
- Sanjeewa Kularatna
  - Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
  - Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, School of Public health and Social Work, Faculty of Health, Queensland University of Technology, Brisbane, QLD, 4059, Australia
  - National Heart Research Institute Singapore, National Heart Centre Singapore, Singapore, Singapore
6
Xie S, Wu J, Chen G. Comparative performance and mapping algorithms between EQ-5D-5L and SF-6Dv2 among the Chinese general population. THE EUROPEAN JOURNAL OF HEALTH ECONOMICS : HEPAC : HEALTH ECONOMICS IN PREVENTION AND CARE 2024; 25:7-19. [PMID: 36709458 DOI: 10.1007/s10198-023-01566-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 04/21/2022] [Accepted: 01/11/2023] [Indexed: 06/18/2023]
Abstract
OBJECTIVES To explore the comparative performance and develop the mapping algorithms between EQ-5D-5L and SF-6Dv2 in China. METHODS Respondents recruited from the Chinese general population completed both EQ-5D-5L and SF-6Dv2 during face-to-face interviews. Ceiling/floor effects were reported. Discriminative validity in self-reported chronic conditions was investigated using the effect sizes (ES). Test-retest reliability was evaluated using intra-class correlation coefficient (ICC) and Bland-Altman plots in a subsample. Correlation and absolute agreements between the two measures were estimated with Spearman's rank correlation coefficient and ICC, respectively. Ordinary least squares (OLS), generalized linear model, Tobit model, and robust MM-estimator were explored to estimate mapping equations between EQ-5D-5L and SF-6Dv2. RESULTS 3320 respondents (50.3% males; age 18-90 years) were recruited. 51.1% and 12.2% of respondents reported no problems on all EQ-5D-5L and SF-6Dv2 dimensions, respectively. The mean EQ-5D-5L utility was higher than SF-6Dv2 (0.947 vs. 0.827, p < 0.001). Utilities were significantly different across all chronic conditions groups for both measures. The mean absolute difference of utilities between the two tests for EQ-5D-5L was smaller (0.033 vs. 0.043) than SF-6Dv2, with a slightly higher ICC (0.859 vs. 0.827). Fair agreement (ICC = 0.582) was observed in the utilities between the two measures. Mapping algorithms generated by the OLS models performed the best according to the goodness-of-fit indicators. CONCLUSIONS Both measures showed comparable discriminative validity. Systematic differences in utilities were found, and on average, the EQ-5D-5L generates higher values than the SF-6Dv2. Mapping algorithms between the EQ-5D-5L and SF-6Dv2 are reported to enable transformations between these two measures in China.
Affiliation(s)
- Shitong Xie
  - School of Pharmaceutical Science and Technology, Tianjin University, Tianjin, China
  - Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Jing Wu
  - School of Pharmaceutical Science and Technology, Tianjin University, Tianjin, China
  - Center for Social Science Survey and Data, Tianjin University, Tianjin, China
- Gang Chen
  - Centre for Health Economics, Monash Business School, Monash University, Melbourne, VIC, Australia
7
Ho KKW, Chau WW, Lau LCM, Ng JP, Chiu KH, Ong MTY. Long-term survivorship and results in lower limb arthroplasty: a registry-based comparison study. BMC Musculoskelet Disord 2023; 24:307. [PMID: 37076860 PMCID: PMC10113734 DOI: 10.1186/s12891-023-06398-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/29/2022] [Accepted: 04/04/2023] [Indexed: 04/21/2023] Open
Abstract
INTRODUCTION The growing popularity of joint replacement surgery in an ever-aging population increases the demand for a proper national joint registry. Our Chinese University of Hong Kong - Prince of Wales Hospital (CUHK-PWH) joint registry has passed its 30th year. The aims of this study are to 1) summarize our territory-wide joint registry, which has passed the 30th year since establishment, and 2) compare our statistics with other major joint registries. METHODS Part 1 was a review of the CUHK-PWH registry. Demographic characteristics of our patients who underwent knee and hip replacements were summarized. Part 2 was a series of comparisons with registries from Sweden, UK, Australia and New Zealand. RESULTS The CUHK-PWH registry captured 2889 primary total knee replacements (TKR) (110 (3.81%) revision) and 879 primary total hip replacements (THR) (107 (12.17%) revision). Median surgery time of TKR was shorter than that of THR. Clinical outcome scores were much improved after surgery in both. Uncemented of hybrid in TKR were most popular in Australia (33.4%) and 40% in Sweden and UK. ASA grade 2 was the most common grade, accounting for more than half of TKR and THR patients. New Zealand reported the best cumulative percentage survival 20 years after surgery: 92.2%, 76.0%, and 84.2% after TKR, unicompartmental knee replacement (UKR), and hip replacement, respectively. CONCLUSION Development of a worldwide accepted patient-reported outcome measure (PROM) is recommended to make comparisons among registries and studies feasible. Completeness of registry data is important and useful to improve surgical performance through data comparisons from different regions. Funding from government on sustaining registries is reflected. Registries from Asian countries have yet to be grown and reported.
Affiliation(s)
- Kevin Ki-Wai Ho
  - Department of Orthopaedics and Traumatology, Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong SAR, China
- Wai-Wang Chau
  - Department of Orthopaedics and Traumatology, Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong SAR, China
- Lawrence Chun-Man Lau
  - Department of Orthopaedics and Traumatology, Prince of Wales Hospital, Hong Kong SAR, China
- Jonathan Patrick Ng
  - Department of Orthopaedics and Traumatology, Prince of Wales Hospital, Hong Kong SAR, China
- Kwok-Hing Chiu
  - Department of Orthopaedics and Traumatology, Prince of Wales Hospital, Hong Kong SAR, China
- Michael Tim-Yun Ong
  - Department of Orthopaedics and Traumatology, Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong SAR, China