1
Otieno JA, Häggström J, Darehed D, Eriksson M. Developing machine learning models to predict multi-class functional outcomes and death three months after stroke in Sweden. PLoS One 2024; 19:e0303287. [PMID: 38739586 PMCID: PMC11090298 DOI: 10.1371/journal.pone.0303287] [Received: 05/02/2023] [Accepted: 04/23/2024] [Indexed: 05/16/2024] [Open Access]
Abstract
Globally, stroke is the third-leading cause of mortality and disability combined, and one of the costliest diseases in society. More accurate predictions of stroke outcomes can guide healthcare organizations in allocating appropriate resources to improve care and reduce both the economic and social burden of the disease. We aim to develop and evaluate the performance and explainability of three supervised machine learning models and the traditional multinomial logistic regression (mLR) in predicting functional dependence and death three months after stroke, using routinely-collected data. This prognostic study included adult patients, registered in the Swedish Stroke Registry (Riksstroke) from 2015 to 2020. Riksstroke contains information on stroke care and outcomes among patients treated in hospitals in Sweden. Prognostic factors (features) included demographic characteristics, pre-stroke functional status, cardiovascular risk factors, medications, acute care, stroke type, and severity. The outcome was measured using the modified Rankin Scale at three months after stroke (a scale of 0-2 indicates independent, 3-5 dependent, and 6 dead). Outcome prediction models included support vector machines, artificial neural networks (ANN), eXtreme Gradient Boosting (XGBoost), and mLR. The models were trained and evaluated on 75% and 25% of the dataset, respectively. Model predictions were explained using SHAP values. The study included 102,135 patients (85.8% ischemic stroke, 53.3% male, mean age 75.8 years, and median NIHSS of 3). All models demonstrated similar overall accuracy (69%-70%). The ANN and XGBoost models performed significantly better than the mLR in classifying dependence with F1-scores of 0.603 (95% CI; 0.594-0.611) and 0.577 (95% CI; 0.568-0.586), versus 0.544 (95% CI; 0.545-0.563) for the mLR model. The factors that contributed most to the predictions were expectedly similar in the models, based on clinical knowledge. 
Our ANN and XGBoost models showed a modest improvement in prediction performance and explainability compared to mLR using routinely-collected data. Their improved ability to predict functional dependence may be of particular importance for the planning and organization of acute stroke care and rehabilitation.
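As a rough illustration of the outcome coding and the per-class F1-score described in this abstract, the following Python sketch maps modified Rankin Scale scores to the three classes (0-2 independent, 3-5 dependent, 6 dead) and computes an F1-score for each class. The function names and the toy data are hypothetical and are not taken from the study; this is not the authors' pipeline.

```python
from typing import Dict, List


def mrs_to_class(mrs: int) -> str:
    """Map a modified Rankin Scale score (0-6) to one of three outcome classes."""
    if not 0 <= mrs <= 6:
        raise ValueError(f"mRS must be between 0 and 6, got {mrs}")
    if mrs <= 2:
        return "independent"
    if mrs <= 5:
        return "dependent"
    return "dead"


def f1_per_class(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Compute the one-vs-rest F1-score for every class label that occurs."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Toy data (hypothetical): observed and predicted mRS at three months
true_mrs = [0, 2, 3, 5, 6, 4, 1, 6]
pred_mrs = [1, 3, 3, 4, 6, 2, 0, 5]

y_true = [mrs_to_class(m) for m in true_mrs]
y_pred = [mrs_to_class(m) for m in pred_mrs]
f1 = f1_per_class(y_true, y_pred)

for label, value in f1.items():
    print(f"{label}: F1 = {value:.3f}")
```

Reporting F1 per class rather than only overall accuracy makes differences in the harder "dependent" class visible, which is the comparison the abstract emphasizes.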
Affiliation(s)
- Jenny Häggström
  - Department of Statistics, USBE, Umeå University, Umeå, Sweden
- David Darehed
  - Department of Public Health and Clinical Medicine, Sunderby Research Unit, Umeå University, Umeå, Sweden
- Marie Eriksson
  - Department of Statistics, USBE, Umeå University, Umeå, Sweden
2
Collins GS, Moons KGM, Dhiman P, Riley RD, Beam AL, Van Calster B, Ghassemi M, Liu X, Reitsma JB, van Smeden M, Boulesteix AL, Camaradou JC, Celi LA, Denaxas S, Denniston AK, Glocker B, Golub RM, Harvey H, Heinze G, Hoffman MM, Kengne AP, Lam E, Lee N, Loder EW, Maier-Hein L, Mateen BA, McCradden MD, Oakden-Rayner L, Ordish J, Parnell R, Rose S, Singh K, Wynants L, Logullo P. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ 2024; 385:e078378. [PMID: 38626948 PMCID: PMC11019967 DOI: 10.1136/bmj-2023-078378] [Accepted: 01/17/2024] [Indexed: 04/19/2024]
Affiliation(s)
- Gary S Collins
  - Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Karel G M Moons
  - Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Paula Dhiman
  - Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Richard D Riley
  - Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
  - National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK
- Andrew L Beam
  - Department of Epidemiology, Harvard T H Chan School of Public Health, Boston, MA, USA
- Ben Van Calster
  - Department of Development and Regeneration, KU Leuven, Leuven, Belgium
  - Department of Biomedical Data Science, Leiden University Medical Centre, Leiden, Netherlands
- Marzyeh Ghassemi
  - Department of Electrical Engineering and Computer Science, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Xiaoxuan Liu
  - Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
  - University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Johannes B Reitsma
  - Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Maarten van Smeden
  - Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
- Anne-Laure Boulesteix
  - Department of Medical Information Processing, Biometry and Epidemiology, Ludwig-Maximilians-University of Munich, Munich, Germany
- Jennifer Catherine Camaradou
  - Patient representative, Health Data Research UK patient and public involvement and engagement group
  - Patient representative, University of East Anglia, Faculty of Health Sciences, Norwich Research Park, Norwich, UK
- Leo Anthony Celi
  - Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Department of Biostatistics, Harvard T H Chan School of Public Health, Boston, MA, USA
- Spiros Denaxas
  - Institute of Health Informatics, University College London, London, UK
  - British Heart Foundation Data Science Centre, London, UK
- Alastair K Denniston
  - National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK
  - Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Ben Glocker
  - Department of Computing, Imperial College London, London, UK
- Robert M Golub
  - Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Georg Heinze
  - Section for Clinical Biometrics, Centre for Medical Data Science, Medical University of Vienna, Vienna, Austria
- Michael M Hoffman
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON, Canada
- Emily Lam
  - Patient representative, Health Data Research UK patient and public involvement and engagement group
- Naomi Lee
  - National Institute for Health and Care Excellence, London, UK
- Elizabeth W Loder
  - The BMJ, London, UK
  - Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Lena Maier-Hein
  - Department of Intelligent Medical Systems, German Cancer Research Centre, Heidelberg, Germany
- Bilal A Mateen
  - Institute of Health Informatics, University College London, London, UK
  - Wellcome Trust, London, UK
  - Alan Turing Institute, London, UK
- Melissa D McCradden
  - Department of Bioethics, Hospital for Sick Children, Toronto, ON, Canada
  - Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
- Lauren Oakden-Rayner
  - Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia
- Johan Ordish
  - Medicines and Healthcare products Regulatory Agency, London, UK
- Richard Parnell
  - Patient representative, Health Data Research UK patient and public involvement and engagement group
- Sherri Rose
  - Department of Health Policy and Center for Health Policy, Stanford University, Stanford, CA, USA
- Karandeep Singh
  - Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, Netherlands
- Laure Wynants
  - Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, Netherlands
- Patricia Logullo
  - Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
3
Lin CH, Chen YA, Jeng JS, Sun Y, Wei CY, Yeh PY, Chang WL, Fann YC, Hsu KC, Lee JT. Predicting ischemic stroke patients' prognosis changes using machine learning in a nationwide stroke registry. Med Biol Eng Comput 2024:10.1007/s11517-024-03073-4. [PMID: 38575823 DOI: 10.1007/s11517-024-03073-4] [Received: 05/19/2023] [Accepted: 03/13/2024] [Indexed: 04/06/2024]
Abstract
Accurately predicting the prognosis of ischemic stroke patients after discharge is crucial for physicians planning long-term health care. Although previous studies have demonstrated that machine learning (ML) can predict stroke outcomes reasonably accurately with limited datasets, identifying specific clinical features associated with prognosis changes after stroke that could aid physicians and patients in devising improved recovery care plans has been challenging. This study aimed to close this gap by using a large national stroke registry database to assess prediction models that estimate how patients' prognosis changes over time, together with the associated clinical factors. To evaluate the best predictive approaches currently available and avoid bias, this study compared three prognosis prediction models: a statistical logistic regression model, commonly used clinical scores, and a high-performance ML-based XGBoost model. The XGBoost model outperformed the other two traditional models, achieving an AUROC of 0.929 in predicting the prognosis changes of stroke patients followed for 3 months. In addition, the XGBoost model maintained remarkably high precision even when using only the 20 most relevant clinical features instead of the full clinical dataset used in the study. These selected features closely correlated with significant changes in clinical outcomes for stroke patients and proved effective for predicting prognosis changes after discharge, allowing physicians to make optimal decisions regarding their patients' recovery.
Affiliation(s)
- Ching-Heng Lin
  - Division of Intramural Research, National Institute of Neurological Disorders and Stroke, National Institutes of Health, 9000 Rockville Pike, Bethesda, MD, 20892, USA
  - Center for Artificial Intelligence in Medicine, Chang Gung Memorial Hospital, Taoyuan, Taiwan
  - Bachelor Program in Artificial Intelligence, Chang Gung University, Taoyuan, Taiwan
- Yi-An Chen
  - Center for Artificial Intelligence in Medicine, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- Jiann-Shing Jeng
  - Stroke Center and Department of Neurology, National Taiwan University Hospital, Taipei, Taiwan
- Yu Sun
  - Department of Neurology, En Chu Kong Hospital, New Taipei City, Taiwan
- Cheng-Yu Wei
  - Department of Exercise and Health Promotion, College of Kinesiology and Health, Chinese Culture University, Taipei, Taiwan
- Po-Yen Yeh
  - Department of Neurology, St. Martin de Porres Hospital, Chiayi, Taiwan
- Wei-Lun Chang
  - Department of Neurology, Show Chwan Memorial Hospital, Changhua County, Taiwan
- Yang C Fann
  - Division of Intramural Research, National Institute of Neurological Disorders and Stroke, National Institutes of Health, 9000 Rockville Pike, Bethesda, MD, 20892, USA
- Kai-Cheng Hsu
  - Department of Medicine, China Medical University, Taichung, Taiwan
  - Artificial Intelligence Center for Medical Diagnosis, China Medical University Hospital, No. 2, Yude Rd., North Dist., Taichung, 404332, Taiwan
  - Department of Neurology, China Medical University Hospital, Taichung, Taiwan
- Jiunn-Tay Lee
  - Department of Neurology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, Republic of China
4
Ullah N, Kiu Chou W, Vardanyan R, Arjomandi Rad A, Shah V, Torabi S, Avavde D, Airapetyan AA, Zubarevich A, Weymann A, Ruhparwar A, Miller G, Malawana J. Machine learning algorithms for the prognostication of abdominal aortic aneurysm progression: a systematic review. Minerva Surg 2024; 79:219-227. [PMID: 37987755 DOI: 10.23736/s2724-5691.23.10130-4] [Indexed: 11/22/2023]
Abstract
INTRODUCTION Abdominal aortic aneurysm (AAA), often characterized by an abdominal aortic diameter over 3.0 cm, is managed through screening, surveillance, and surgical intervention. AAA growth can be heterogeneous and rupture carries a high mortality rate, with size and certain risk factors influencing rupture risk. Research is ongoing to accurately predict individual AAA growth rates for personalized management. Machine learning, a subset of artificial intelligence, has shown promise in various medical fields, including endoleak detection post-EVAR. However, its application for predicting AAA growth remains insufficiently explored and needs further investigation. This paper therefore aims to summarize the current status of machine learning in predicting AAA growth. EVIDENCE ACQUISITION A systematic search of Embase, MEDLINE, Cochrane, PubMed and Google Scholar from inception until December 2022 was conducted for original articles that discussed the use of machine learning in predicting AAA growth. EVIDENCE SYNTHESIS Overall, 2742 articles were extracted, of which seven retrospective studies involving 410 patients were included using predetermined criteria. Six out of seven studies applied a supervised learning approach for their machine learning (ML) models, with considerable diversity observed within specific ML models. The majority of the studies concluded that machine learning models perform better in predicting AAA growth in comparison to reference models. All studies focused on predicting AAA growth over specified durations. Maximal luminal diameter was the most frequently used indicator, with alternative predictors being AAA volume, ILT (intraluminal thrombus) and flow-mediated diameter (FMD). CONCLUSIONS The nascent field of applying machine learning (ML) for Abdominal Aortic Aneurysm (AAA) expansion prediction exhibits potential to enhance predictive accuracy across diverse parameters. 
Future studies must emphasize evidencing clinical utility in a healthcare system context, thereby ensuring patient outcome improvement. This will necessitate addressing key ethical implications in establishing prospective studies related to this topic and collaboration among pivotal stakeholders within the AI field.
Affiliation(s)
- Nazifa Ullah
  - Faculty of Medicine, University College London, London, UK
- Wing Kiu Chou
  - Norwich Medical School, University of East Anglia, Norwich, UK
- Robert Vardanyan
  - Department of Medicine, Faculty of Medicine, Imperial College London, London, UK
  - Research Unit, The Healthcare Leadership Academy, London, UK
- Arian Arjomandi Rad
  - Department of Medicine, Faculty of Medicine, Imperial College London, London, UK
  - Research Unit, The Healthcare Leadership Academy, London, UK
  - Medical Sciences Division, University of Oxford, Oxford, UK
- Viraj Shah
  - Department of Medicine, Faculty of Medicine, Imperial College London, London, UK
- Saeed Torabi
  - Department of Anesthesiology, University Hospital Cologne, Cologne, Germany
- Dani Avavde
  - Department of Vascular Surgery, Nottingham University Hospitals NHS Trust, Nottingham, UK
- Arkady A Airapetyan
  - Department of Research and Academia, National Institute of Health, Yerevan, Armenia
- Alina Zubarevich
  - Department of Cardiothoracic Transplant and Vascular Surgery, Hannover Medical School, Hannover, Germany
- Alexander Weymann
  - Department of Cardiothoracic Transplant and Vascular Surgery, Hannover Medical School, Hannover, Germany
- Arjang Ruhparwar
  - Department of Cardiothoracic Transplant and Vascular Surgery, Hannover Medical School, Hannover, Germany
- George Miller
  - Research Unit, The Healthcare Leadership Academy, London, UK
  - Centre for Digital Health and Education Research (CoDHER), University of Central Lancashire Medical School, Preston, UK
- Johann Malawana
  - Research Unit, The Healthcare Leadership Academy, London, UK
  - Centre for Digital Health and Education Research (CoDHER), University of Central Lancashire Medical School, Preston, UK
5
Cohen-Cohen S, Jabal MS, Rinaldo L, Savastano LE, Lanzino G, Cloft H, Brinjikji W. Middle meningeal artery embolization for chronic subdural hematoma: A single-center experience and predictive modeling of outcomes. Neuroradiol J 2024; 37:192-198. [PMID: 38147825 DOI: 10.1177/19714009231224431] [Indexed: 12/28/2023] [Open Access]
Abstract
BACKGROUND Interest is growing in middle meningeal artery embolization (MMAE) as an emerging alternative therapy for chronic subdural hematoma (cSDH). The study aims to report a large center's experience, identify the variables associated with treatment failure, and build experimental machine learning (ML) models for outcome prediction. MATERIAL AND METHODS A 2-year experience in MMAE for managing patients with chronic subdural hematoma was analyzed. Descriptive statistical analysis was conducted using imaging and clinical features of the patients and cSDH, which were subsequently used to build predictive models for the procedure outcome. The modeling evaluation metrics were the area under the ROC curve and F1-score. RESULTS A total of 100 cSDHs in 76 patients who underwent MMAE were included, with an average follow-up of 6 months. The intervention had a per-procedure success rate of 92%. Thrombocytopenia had a highly significant association with treatment failure. Two patients suffered a complication related to the procedure. The best performing machine learning models in predicting MMAE failure achieved an ROC-AUC of 70% and an F1-score of 67% when including all patients with or without surgical intervention prior to embolization, and an ROC-AUC of 82% and an F1-score of 69% when only patients who underwent upfront MMAE were included. CONCLUSION MMAE is a safe and minimally invasive procedure with great potential in transforming the management of cSDH and reducing the risk of surgical complications in selected patients. An ML approach with a larger sample size might help better predict outcomes and highlight important predictors following MMAE in patients with cSDH.
6
Lee DY, Kim N, Park C, Gan S, Son SJ, Park RW, Park B. Explainable multimodal prediction of treatment-resistance in patients with depression leveraging brain morphometry and natural language processing. Psychiatry Res 2024; 334:115817. [PMID: 38430816 DOI: 10.1016/j.psychres.2024.115817] [Received: 08/07/2023] [Revised: 02/19/2024] [Accepted: 02/23/2024] [Indexed: 03/05/2024]
Abstract
Although 20% of patients with depression receiving treatment do not achieve remission, predicting treatment-resistant depression (TRD) remains challenging. In this study, we aimed to develop an explainable multimodal prediction model for TRD using structured electronic medical record data, brain morphometry, and natural language processing. In total, 247 patients with a new depressive episode were included. TRD-predictive models were developed based on combinations of the following parameters: selected tabular dataset features, independent components-map weightings from brain T1-weighted magnetic resonance imaging (MRI), and topic probabilities from clinical notes. All models applied the extreme gradient boosting (XGBoost) algorithm via five-fold cross-validation. The model using all data sources showed the highest area under the receiver operating characteristic curve of 0.794, followed by models that used combined brain MRI and structured data, brain MRI and clinical notes, clinical notes and structured data, brain MRI only, structured data only, and clinical notes only (0.770, 0.762, 0.728, 0.703, 0.684, and 0.569, respectively). Classifications of TRD were driven by several predictors, such as previous exposure to antidepressants and antihypertensive medications, sensorimotor network, default mode network, and somatic symptoms. Our findings suggest that a combination of clinical data with neuroimaging and natural language processing variables improves the prediction of TRD.
Affiliation(s)
- Dong Yun Lee
  - Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea; Department of Medical Sciences, Graduate School of Ajou University, Suwon, South Korea
- Narae Kim
  - Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea; Department of Biomedical Sciences, Graduate School of Ajou University, Suwon, South Korea
- ChulHyoung Park
  - Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea; Department of Medical Sciences, Graduate School of Ajou University, Suwon, South Korea
- Sujin Gan
  - Department of Biomedical Sciences, Graduate School of Ajou University, Suwon, South Korea
- Sang Joon Son
  - Department of Psychiatry, Ajou University School of Medicine, Suwon, South Korea
- Rae Woong Park
  - Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea; Department of Biomedical Sciences, Graduate School of Ajou University, Suwon, South Korea
- Bumhee Park
  - Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea; Office of Biostatistics, Medical Research Collaborating Center, Ajou Research Institute for Innovative Medicine, Ajou University Medical Center, Suwon, South Korea
7
Axford D, Sohel F, Abedi V, Zhu Y, Zand R, Barkoudah E, Krupica T, Iheasirim K, Sharma UM, Dugani SB, Takahashi PY, Bhagra S, Murad MH, Saposnik G, Yousufuddin M. Development and internal validation of machine learning-based models and external validation of existing risk scores for outcome prediction in patients with ischaemic stroke. European Heart Journal. Digital Health 2024; 5:109-122. [PMID: 38505491 PMCID: PMC10944684 DOI: 10.1093/ehjdh/ztad073] [Received: 06/15/2023] [Revised: 10/14/2023] [Accepted: 10/30/2023] [Indexed: 03/21/2024]
Abstract
Aims We developed new machine learning (ML) models and externally validated existing statistical models [ischaemic stroke predictive risk score (iScore) and totalled health risks in vascular events (THRIVE) scores] for predicting the composite of recurrent stroke or all-cause mortality at 90 days and at 3 years after hospitalization for first acute ischaemic stroke (AIS). Methods and results In adults hospitalized with AIS from January 2005 to November 2016, with follow-up until November 2019, we developed three ML models [random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGBOOST)] and externally validated the iScore and THRIVE scores for predicting the composite outcomes after AIS hospitalization, using data from 721 patients and 90 potential predictor variables. At 90 days and 3 years, 11 and 34% of patients, respectively, reached the composite outcome. For the 90-day prediction, the area under the receiver operating characteristic curve (AUC) was 0.779 for RF, 0.771 for SVM, 0.772 for XGBOOST, 0.720 for iScore, and 0.664 for THRIVE. For 3-year prediction, the AUC was 0.743 for RF, 0.777 for SVM, 0.773 for XGBOOST, 0.710 for iScore, and 0.675 for THRIVE. Conclusion The study provided three ML-based predictive models that achieved good discrimination and clinical usefulness in outcome prediction after AIS and broadened the application of the iScore and THRIVE scoring system for long-term outcome prediction. Our findings warrant comparative analyses of ML and existing statistical method-based risk prediction tools for outcome prediction after AIS in new data sets.
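The AUC values compared in this abstract can be read as the probability that a randomly chosen patient who reached the outcome receives a higher predicted risk than a randomly chosen patient who did not. A minimal Python sketch of that pairwise definition follows; the function name and toy data are hypothetical, and the study's own software is not stated here.

```python
from typing import List


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Each (positive, negative) pair counts 1 if the positive has the higher
    score, 0.5 if the scores tie, and 0 otherwise.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = composite outcome reached, 0 = not reached
outcome = [1, 0, 1, 0, 0]
risk = [0.9, 0.4, 0.35, 0.2, 0.35]
auc = roc_auc(outcome, risk)
print(f"AUC = {auc:.3f}")
```

The quadratic pairwise loop is chosen for clarity; with 721 patients it would still run instantly, although rank-based formulas are the usual choice for large registries.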
Affiliation(s)
- Daniel Axford
  - Department of Information Technology, Mathematics and Statistics, College of Science, Health, Engineering and Education, Murdoch University, Murdoch, Australia
- Ferdous Sohel
  - Department of Information Technology, Mathematics and Statistics, College of Science, Health, Engineering and Education, Murdoch University, Murdoch, Australia
- Vida Abedi
  - Department of Public Health Science, Penn State College of Medicine, Hershey, PA, USA
- Ye Zhu
  - Robert D. and Patricia E. Kern Centre for the Science of Healthcare Delivery, Mayo Clinic, Rochester, MN, USA
- Ramin Zand
  - Neuroscience Institute, Geisinger Health System, 100 North Academy Ave, Danville, PA 17822, USA
  - Neuroscience Institute, The Pennsylvania State University, Hershey, PA 17033, USA
- Ebrahim Barkoudah
  - Internal Medicine/Hospital Medicine, Brigham and Women’s Hospital, Harvard University, Boston, MA, USA
- Troy Krupica
  - Internal Medicine/Hospital Medicine, West Virginia University, Morgantown, WV, USA
- Kingsley Iheasirim
  - Internal Medicine/Hospital Internal Medicine, Mayo Clinic Health System, Mankato, MN, USA
- Umesh M Sharma
  - Hospital Internal Medicine, Mayo Clinic, Phoenix, AZ, USA
- Sagar B Dugani
  - Hospital Internal Medicine, Mayo Clinic, Rochester, MN, USA
- Sumit Bhagra
  - Endocrinology, Diabetes and Metabolism, Mayo Clinic Health System, Austin, MN, USA
- Mohammad H Murad
  - Division of Public Health, Infectious Diseases, and Occupational Medicine, Mayo Clinic, Rochester, MN, USA
- Gustavo Saposnik
  - Stroke Outcomes and Decision Neuroscience Research Unit, Division of Neurology, Department of Medicine and Li Ka Shing Knowledge Institute, St. Michael’s Hospital, University of Toronto, Toronto, Ontario, Canada
- Mohammed Yousufuddin
  - Hospital Internal Medicine, Mayo Clinic Health System, 1000 1st Drive NW, Austin, MN 55912, USA
8
Kolasa K, Admassu B, Hołownia-Voloskova M, Kędzior KJ, Poirrier JE, Perni S. Systematic reviews of machine learning in healthcare: a literature review. Expert Rev Pharmacoecon Outcomes Res 2024; 24:63-115. [PMID: 37955147 DOI: 10.1080/14737167.2023.2279107] [Received: 07/17/2023] [Accepted: 10/31/2023] [Indexed: 11/14/2023]
Abstract
INTRODUCTION The increasing availability of data and computing power has made machine learning (ML) a viable approach to faster, more efficient healthcare delivery. METHODS A systematic literature review (SLR) was conducted of SLRs evaluating ML applications in healthcare settings that were published between 1 January 2010 and 27 March 2023. RESULTS In total, 220 SLRs covering 10,462 ML algorithms were reviewed. The main application of AI in medicine related to clinical prediction and disease prognosis in oncology and neurology with the use of imaging data. Accuracy, specificity, and sensitivity were provided in 56%, 28%, and 25% of SLRs, respectively. Internal and external validation was reported in 53% and less than 1% of the cases, respectively. The most common modeling approach was neural networks (2,454 ML algorithms), followed by support vector machine and random forest/decision trees (1,578 and 1,522 ML algorithms, respectively). EXPERT OPINION The review indicated considerable reporting gaps in ML performance and in both internal and external validation. Greater accessibility to healthcare data for developers can ensure the faster adoption of ML algorithms into clinical practice.
Affiliation(s)
- Katarzyna Kolasa
  - Division of Health Economics and Healthcare Management, Kozminski University, Warsaw, Poland
- Bisrat Admassu
  - Division of Health Economics and Healthcare Management, Kozminski University, Warsaw, Poland
9
Che Nawi CMNH, Mohd Hairon S, Wan Yahya WNN, Wan Zaidi WA, Musa KI. Machine Learning Models for Predicting Stroke Mortality in Malaysia: An Application and Comparative Analysis. Cureus 2023; 15:e50426. [PMID: 38222138 PMCID: PMC10784718 DOI: 10.7759/cureus.50426] [Accepted: 12/11/2023] [Indexed: 01/16/2024] [Open Access]
Abstract
Background Stroke is a significant public health concern characterized by increasing mortality and morbidity. Accurate long-term outcome prediction for acute stroke patients, particularly stroke mortality, is vital for clinical decision-making and prognostic management. This study aimed to develop and compare various prognostic models for stroke mortality prediction. Methods In a retrospective cohort study from January 2016 to December 2021, we collected data from patients diagnosed with acute stroke from five selected hospitals. Data contained variables on demographics, comorbidities, and interventions retrieved from medical records. The cohort comprised 950 patients with 20 features. Outcomes (censored vs. death) were determined by linking data with the Malaysian National Mortality Registry. We employed three common survival modeling approaches, the Cox proportional hazard regression (Cox), support vector machine (SVM), and random survival forest (RSF), while enhancing the Cox model with Elastic Net (Cox-EN) for feature selection. Models were compared using the concordance index (C-index), time-dependent area under the curve (AUC), and discrimination index (D-index), with calibration assessed by the Brier score. Results The support vector machine (SVM) model excelled among the four, with three-month, one-year, and three-year time-dependent AUC values of 0.842, 0.846, and 0.791; a D-index of 5.31 (95% CI: 3.86, 7.30); and a C-index of 0.803 (95% CI: 0.758, 0.847). All models exhibited robust calibration, with three-month, one-year, and three-year Brier scores ranging from 0.103 to 0.220, all below 0.25. Conclusion The support vector machine (SVM) model demonstrated superior discriminative performance, suggesting its efficacy in developing prognostic models for stroke mortality. This study enhances stroke mortality prediction and supports clinical decision-making, emphasizing the utility of the support vector machine method.
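Two of the metrics named in this abstract can be sketched in a few lines of Python. The first function is a simplified Harrell's concordance index that skips pairs with tied times; the second is a plain Brier score that ignores censoring, whereas a time-dependent Brier score for survival data would normally apply censoring weights. Both function names and the toy data are hypothetical and do not reproduce the study's analysis.

```python
from typing import List


def harrell_c_index(times: List[float], events: List[int], risks: List[float]) -> float:
    """Simplified Harrell's C: share of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i died (event == 1) strictly
    before subject j's observed time. It is concordant when i has the higher
    predicted risk; ties in predicted risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def brier_score(probs: List[float], outcomes: List[int]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Toy data (hypothetical): follow-up in months, 1 = death, 0 = censored
times = [5, 10, 12, 20]
events = [1, 0, 1, 0]
risks = [0.9, 0.3, 0.1, 0.2]

c_index = harrell_c_index(times, events, risks)
# A constant prediction of 0.5 scores exactly 0.25, the usual reference value
baseline = brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
print(f"C-index = {c_index:.2f}, uninformative Brier baseline = {baseline:.2f}")
```

The 0.25 baseline is the score of an uninformative model that always predicts 0.5, which is presumably why the abstract notes that all Brier scores fall below 0.25.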
Affiliation(s)
- Suhaily Mohd Hairon
- Department of Community Medicine, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian, MYS
- Wan Nur Nafisah Wan Yahya
- Department of Internal Medicine, Universiti Kebangsaan Malaysia Medical Centre (UKMMC), Kuala Lumpur, MYS
- Wan Asyraf Wan Zaidi
- Department of Internal Medicine, Universiti Kebangsaan Malaysia Medical Centre (UKMMC), Kuala Lumpur, MYS
- Kamarul Imran Musa
- Department of Community Medicine, School of Medical Sciences, Universiti Sains Malaysia, Kota Bharu, MYS
10
Wang W, Otieno JA, Eriksson M, Wolfe CD, Curcin V, Bray BD. Developing and externally validating a machine learning risk prediction model for 30-day mortality after stroke using national stroke registers in the UK and Sweden. BMJ Open 2023; 13:e069811. [PMID: 37968001 PMCID: PMC10660948 DOI: 10.1136/bmjopen-2022-069811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Accepted: 07/27/2023] [Indexed: 11/17/2023] Open
Abstract
OBJECTIVES We aimed to develop and externally validate a generalisable risk prediction model for 30-day stroke mortality suitable for supporting quality improvement analytics in stroke care using large nationwide stroke registers in the UK and Sweden. DESIGN Registry-based cohort study. SETTING Stroke registries including the Sentinel Stroke National Audit Programme (SSNAP) in England, Wales and Northern Ireland (2013-2019) and the national Swedish stroke register (Riksstroke 2015-2020). PARTICIPANTS AND METHODS Data from SSNAP were used for developing and temporally validating the model, and data from Riksstroke were used for external validation. Models were developed with the variables available in both registries using logistic regression (LR), LR with elastic net and interaction terms and eXtreme Gradient Boosting (XGBoost). Performances were evaluated with discrimination, calibration and decision curves. OUTCOME MEASURES The primary outcome was all-cause 30-day in-hospital mortality after stroke. RESULTS In total, 488 497 patients who had a stroke with 12.4% 30-day in-hospital mortality were used for developing and temporally validating the model in the UK. A total of 128 360 patients who had a stroke with 10.8% 30-day in-hospital mortality and 13.1% all mortality were used for external validation in Sweden. In the SSNAP temporal validation set, the final XGBoost model achieved the highest area under the receiver operating characteristic curve (AUC) (0.852 (95% CI 0.848 to 0.855)) and was well calibrated. The performances on the external validation in Riksstroke were as good and achieved AUC at 0.861 (95% CI 0.858 to 0.865) for in-hospital mortality. For Riksstroke, the models slightly overestimated the risk for in-hospital mortality, while they were better calibrated at the risk for all mortality. CONCLUSION The risk prediction model was accurate and externally validated using high quality registry data. 
This is potentially suitable to be deployed as part of quality improvement analytics in stroke care to enable the fair comparison of stroke mortality outcomes across hospitals and health systems across countries.
Affiliation(s)
- Wenjuan Wang
- Department of Population Health Sciences, King's College London, London, UK
- Charles D Wolfe
- Department of Population Health Sciences, King's College London, London, UK
- Vasa Curcin
- Department of Population Health Sciences, King's College London, London, UK
- Benjamin D Bray
- Department of Population Health Sciences, King's College London, London, UK
11
Ajuwon BI, Awotundun ON, Richardson A, Roper K, Sheel M, Rahman N, Salako A, Lidbury BA. Machine learning prediction models for clinical management of blood-borne viral infections: a systematic review of current applications and future impact. Int J Med Inform 2023; 179:105244. [PMID: 37820561 DOI: 10.1016/j.ijmedinf.2023.105244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Revised: 09/08/2023] [Accepted: 10/03/2023] [Indexed: 10/13/2023]
Abstract
BACKGROUND Machine learning (ML) prediction models to support clinical management of blood-borne viral infections are becoming increasingly abundant in medical literature, with a number of competing models being developed for the same outcome or target population. However, evidence on the quality of these ML prediction models is limited. OBJECTIVE This study aimed to evaluate the development and quality of reporting of ML prediction models that could facilitate timely clinical management of blood-borne viral infections. METHODS We conducted narrative evidence synthesis following the synthesis without meta-analysis guidelines. We searched PubMed and Cochrane Central Register of Controlled Trials for all studies applying ML models for predicting clinical outcomes associated with hepatitis B virus (HBV), human immunodeficiency virus (HIV), or hepatitis C virus (HCV). RESULTS We found 33 unique ML prediction models aiming to support clinical decision making. Overall, 12 (36.4%) focused on HBV, 10 (30.3%) on HCV, 10 on HIV (30.3%) and two (6.1%) on co-infection. Among these, six (18.2%) addressed the diagnosis of infection, 16 (48.5%) the prognosis of infection, eight (24.2%) the prediction of treatment response, two (6.1%) progression through a cascade of care, and one (3.03%) focused on the choice of antiretroviral therapy (ART). Nineteen prediction models (57.6%) were developed using data from high-income countries. Evaluation of prediction models was limited to measures of performance. Detailed information on software code accessibility was often missing. Independent validation on new datasets and/or in other institutions was rarely done. CONCLUSION Promising approaches for ML prediction models in blood-borne viral infections were identified, but the lack of robust validation, interpretability/explainability, and poor quality of reporting hampered their clinical relevance. Our findings highlight important considerations that can inform standard reporting guidelines for ML prediction models in the future (e.g., TRIPOD-AI), and provide critical data to inform robust evaluation of the models.
Affiliation(s)
- Busayo I Ajuwon
- National Centre for Epidemiology and Population Health, ANU College of Health and Medicine, The Australian National University, Acton, Australian Capital Territory, Australia; Department of Biosciences and Biotechnology, Faculty of Pure and Applied Sciences, Kwara State University, Malete, Nigeria.
- Oluwatosin N Awotundun
- Department of Epidemiology, Biostatistics and Occupational Health, Faculty of Medicine and Health Sciences, McGill University, Montreal, Canada
- Alice Richardson
- Statistical Support Network, The Australian National University, Acton, ACT, Australia
- Katrina Roper
- National Centre for Epidemiology and Population Health, ANU College of Health and Medicine, The Australian National University, Acton, Australian Capital Territory, Australia
- Meru Sheel
- Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, New South Wales, Australia
- Nurudeen Rahman
- Department of Medical Parasitology and Infection Biology, Swiss Tropical and Public Health Institute, Basel, Switzerland
- Abideen Salako
- Department of Clinical Sciences, Nigerian Institute of Medical Research, Yaba, Lagos State, Nigeria
- Brett A Lidbury
- National Centre for Epidemiology and Population Health, ANU College of Health and Medicine, The Australian National University, Acton, Australian Capital Territory, Australia
12
Jo H, Kim C, Gwon D, Lee J, Lee J, Park KM, Park S. Combining clinical and imaging data for predicting functional outcomes after acute ischemic stroke: an automated machine learning approach. Sci Rep 2023; 13:16926. [PMID: 37805568 PMCID: PMC10560215 DOI: 10.1038/s41598-023-44201-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/24/2023] [Accepted: 10/04/2023] [Indexed: 10/09/2023] Open
Abstract
This study aimed to develop and validate an automated machine learning (ML) system that predicts 3-month functional outcomes in acute ischemic stroke (AIS) patients by combining clinical and neuroimaging features. Functional outcomes were categorized as unfavorable (modified Rankin Scale ≥ 3) or not. A clinical model employing optimal clinical features (Model_A), a convolutional neural network model incorporating imaging data (Model_B), and an integrated model combining both imaging and clinical features (Model_C) were developed and tested to predict unfavorable outcomes. The developed models were compared with each other and with traditional risk-scoring models. The dataset comprised 4147 patients from a multicenter stroke registry, with 1268 (30.6%) experiencing unfavorable outcomes. Age, initial NIHSS, and early neurologic deterioration were identified as the most important clinical features. The ML model prediction achieved an area under the curves of 0.757 (95% CI 0.726-0.789) for Model_A, 0.725 (95% CI 0.693-0.755) for Model_B, and 0.786 (95% CI 0.757-0.814) for Model_C in the test set. The integrated models outperformed traditional risk-scoring models by 0.21 (95% CI 0.16-0.25) for HIAT and 0.15 (95% CI 0.11-0.19) for THRIVE. In conclusion, the integrated ML system enhanced stroke outcome prediction by combining imaging data and clinical features, outperforming traditional risk-scoring models.
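The area under the ROC curve reported for the three models can be read as a probability: the chance that a randomly chosen patient with an unfavorable outcome receives a higher predicted score than a randomly chosen patient without one. The Python sketch below computes it directly from that pairwise definition. The function name and the toy scores are assumptions for illustration, and the study's own software and confidence-interval method are not stated in the abstract.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks; label 1 = unfavorable outcome (mRS >= 3).
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
print(auc(scores, labels))  # 3 of the 4 positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of patients, so a rank-based formula would be the usual choice for a registry of several thousand cases.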
Affiliation(s)
- Hongju Jo
- Department of CGMS Sensor, Sensor R&D Center, i-SENS, Seoul, Republic of Korea
- Changi Kim
- Integrated Major in Innovative Medical Science, Seoul National University Graduate School, Seoul, Republic of Korea
- Dowan Gwon
- Department of Digital&Biohealth, Group of AI/DX Business, KT, Seoul, Republic of Korea
- Jaeho Lee
- Department of Preventive Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
- Joonwon Lee
- Department of Neurology, Haeundae Paik Hospital, Inje University College of Medicine, Haeundae-ro 875, Haeundae-gu, 48108, Busan, Republic of Korea
- Kang Min Park
- Department of Neurology, Haeundae Paik Hospital, Inje University College of Medicine, Haeundae-ro 875, Haeundae-gu, 48108, Busan, Republic of Korea
- Seongho Park
- Department of Neurology, Haeundae Paik Hospital, Inje University College of Medicine, Haeundae-ro 875, Haeundae-gu, 48108, Busan, Republic of Korea.
13
Abdulazeem H, Whitelaw S, Schauberger G, Klug SJ. A systematic review of clinical health conditions predicted by machine learning diagnostic and prognostic models trained or validated using real-world primary health care data. PLoS One 2023; 18:e0274276. [PMID: 37682909 PMCID: PMC10491005 DOI: 10.1371/journal.pone.0274276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Accepted: 08/29/2023] [Indexed: 09/10/2023] Open
Abstract
With the advances in technology and data science, machine learning (ML) is being rapidly adopted by the health care sector. However, there is a lack of literature addressing the health conditions targeted by the ML prediction models within primary health care (PHC) to date. To fill this gap in knowledge, we conducted a systematic review following the PRISMA guidelines to identify health conditions targeted by ML in PHC. We searched the Cochrane Library, Web of Science, PubMed, Elsevier, BioRxiv, Association of Computing Machinery (ACM), and IEEE Xplore databases for studies published from January 1990 to January 2022. We included primary studies addressing ML diagnostic or prognostic predictive models that were supplied completely or partially by real-world PHC data. Studies selection, data extraction, and risk of bias assessment using the prediction model study risk of bias assessment tool were performed by two investigators. Health conditions were categorized according to international classification of diseases (ICD-10). Extracted data were analyzed quantitatively. We identified 106 studies investigating 42 health conditions. These studies included 207 ML prediction models supplied by the PHC data of 24.2 million participants from 19 countries. We found that 92.4% of the studies were retrospective and 77.3% of the studies reported diagnostic predictive ML models. A majority (76.4%) of all the studies were for models' development without conducting external validation. Risk of bias assessment revealed that 90.8% of the studies were of high or unclear risk of bias. The most frequently reported health conditions were diabetes mellitus (19.8%) and Alzheimer's disease (11.3%). Our study provides a summary on the presently available ML prediction models within PHC. We draw the attention of digital health policy makers, ML models developer, and health care professionals for more future interdisciplinary research collaboration in this regard.
Affiliation(s)
- Hebatullah Abdulazeem
- Chair of Epidemiology, Department of Sport and Health Sciences, Technical University of Munich (TUM), Munich, Germany
- Sera Whitelaw
- Faculty of Medicine and Health Sciences, McGill University, Montreal, Quebec, Canada
- Gunther Schauberger
- Chair of Epidemiology, Department of Sport and Health Sciences, Technical University of Munich (TUM), Munich, Germany
- Stefanie J. Klug
- Chair of Epidemiology, Department of Sport and Health Sciences, Technical University of Munich (TUM), Munich, Germany
14
Che Nawi CMNH, Mohd Hairon S, Wan Yahya WNN, Wan Zaidi WA, Hassan MR, Musa KI. Machine Learning Application: A Bibliometric Analysis From a Half-Century of Research on Stroke. Cureus 2023; 15:e44142. [PMID: 37753006 PMCID: PMC10518640 DOI: 10.7759/cureus.44142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/25/2023] [Indexed: 09/28/2023] Open
Abstract
The quick advancement of digital technology through artificial intelligence has made it possible to deploy machine learning to predict stroke outcomes. Our aim is to examine the trend of machine learning applications in stroke-related research over the past 50 years. We used search terms stroke and machine learning to search for English versions of original and review articles and conference proceedings published over the past 50 years in Scopus and Web of Science databases. The Biblioshiny web application was utilized for the analysis. The trend of publication and prominent authors and journals were analyzed and identified. The collaborative network between countries was mapped, and a thematic map was used to monitor the authors' trending keywords. In total, 10,535 publications authored by 44,990 authors from 2,212 sources were retrieved. Two distinct clusters of collaborative network nodes were observed, with the United States serving as a connecting node. Three terms - deep learning, algorithms, and neural networks - are observed in the early stages of the emerging theme. Overall, international research collaborations, the establishment of global research initiatives, the development of computational science, and the availability of big data have facilitated the pervasive use of machine learning techniques in stroke research.
Affiliation(s)
- Suhaily Mohd Hairon
- Department of Community Medicine, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian, MYS
- Wan Nur Nafisah Wan Yahya
- Department of Internal Medicine/ Neurology, Universiti Kebangsaan Malaysia Medical Centre, Kuala Lumpur, MYS
- Wan Asyraf Wan Zaidi
- Department of Internal Medicine/ Neurology, Universiti Kebangsaan Malaysia Medical Centre, Kuala Lumpur, MYS
- Mohd Rohaizat Hassan
- Department of Community Health, Faculty of Medicine, National University of Malaysia, Kuala Lumpur, MYS
- Kamarul Imran Musa
- Department of Community Medicine, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian, MYS
15
Miyazaki Y, Kawakami M, Kondo K, Tsujikawa M, Honaga K, Suzuki K, Tsuji T. Improvement of predictive accuracies of functional outcomes after subacute stroke inpatient rehabilitation by machine learning models. PLoS One 2023; 18:e0286269. [PMID: 37235575 DOI: 10.1371/journal.pone.0286269] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/07/2022] [Accepted: 05/11/2023] [Indexed: 05/28/2023] Open
Abstract
OBJECTIVES Stepwise linear regression (SLR) is the most common approach to predicting activities of daily living at discharge with the Functional Independence Measure (FIM) in stroke patients, but noisy nonlinear clinical data decrease the predictive accuracies of SLR. Machine learning is gaining attention in the medical field for such nonlinear data. Previous studies reported that machine learning models, regression tree (RT), ensemble learning (EL), artificial neural networks (ANNs), support vector regression (SVR), and Gaussian process regression (GPR), are robust to such data and increase predictive accuracies. This study aimed to compare the predictive accuracies of SLR and these machine learning models for FIM scores in stroke patients. METHODS Subacute stroke patients (N = 1,046) who underwent inpatient rehabilitation participated in this study. Only patients' background characteristics and FIM scores at admission were used to build each predictive model of SLR, RT, EL, ANN, SVR, and GPR with 10-fold cross-validation. The coefficient of determination (R2) and root mean square error (RMSE) values were compared between the actual and predicted discharge FIM scores and FIM gain. RESULTS Machine learning models (R2 of RT = 0.75, EL = 0.78, ANN = 0.81, SVR = 0.80, GPR = 0.81) outperformed SLR (0.70) to predict discharge FIM motor scores. The predictive accuracies of machine learning methods for FIM total gain (R2 of RT = 0.48, EL = 0.51, ANN = 0.50, SVR = 0.51, GPR = 0.54) were also better than of SLR (0.22). CONCLUSIONS This study suggested that the machine learning models outperformed SLR for predicting FIM prognosis. The machine learning models used only patients' background characteristics and FIM scores at admission and more accurately predicted FIM gain than previous studies. ANN, SVR, and GPR outperformed RT and EL. GPR could have the best predictive accuracy for FIM prognosis.
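The two accuracy measures compared in this abstract, the coefficient of determination (R2) and the root mean square error (RMSE), can be sketched in a few lines of Python. This sketch assumes R2 is defined as one minus the ratio of residual to total sum of squares; the abstract does not say whether the authors used that definition or the squared correlation, and the FIM values below are invented for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical discharge FIM motor scores and model predictions.
actual = [40, 55, 70, 85]
predicted = [45, 50, 72, 80]
print(r_squared(actual, predicted), rmse(actual, predicted))
```

Under this definition a model that always predicts the mean observed score has an R2 of 0, which gives a baseline for reading the 0.22 to 0.54 range reported for FIM total gain.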
Affiliation(s)
- Yuta Miyazaki
- Department of Physical Rehabilitation, National Center Hospital, National Center of Neurology and Psychiatry, Tokyo, Japan
- Department of Rehabilitation Medicine, Tokyo Bay Rehabilitation Hospital, Chiba, Japan
- Department of Rehabilitation Medicine, Keio University School of Medicine, Tokyo, Japan
- Michiyuki Kawakami
- Department of Rehabilitation Medicine, Tokyo Bay Rehabilitation Hospital, Chiba, Japan
- Department of Rehabilitation Medicine, Keio University School of Medicine, Tokyo, Japan
- Kunitsugu Kondo
- Department of Rehabilitation Medicine, Tokyo Bay Rehabilitation Hospital, Chiba, Japan
- Department of Rehabilitation Medicine, Keio University School of Medicine, Tokyo, Japan
- Masahiro Tsujikawa
- Department of Rehabilitation Medicine, Tokyo Bay Rehabilitation Hospital, Chiba, Japan
- Department of Rehabilitation Medicine, Keio University School of Medicine, Tokyo, Japan
- Kaoru Honaga
- Department of Rehabilitation Medicine, Tokyo Bay Rehabilitation Hospital, Chiba, Japan
- Department of Rehabilitation Medicine, Juntendo University Graduate School of Medicine, Tokyo, Japan
- Kanjiro Suzuki
- Department of Rehabilitation Medicine, Waseda Clinic, Miyazaki, Japan
- Tetsuya Tsuji
- Department of Rehabilitation Medicine, Keio University School of Medicine, Tokyo, Japan
16
Dhiman P, Ma J, Andaur Navarro CL, Speich B, Bullock G, Damen JAA, Hooft L, Kirtley S, Riley RD, Van Calster B, Moons KGM, Collins GS. Overinterpretation of findings in machine learning prediction model studies in oncology: a systematic review. J Clin Epidemiol 2023; 157:120-133. [PMID: 36935090 DOI: 10.1016/j.jclinepi.2023.03.012] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/22/2022] [Revised: 03/03/2023] [Accepted: 03/14/2023] [Indexed: 03/19/2023]
Abstract
OBJECTIVES In biomedical research, spin is the overinterpretation of findings, and it is a growing concern. To date, the presence of spin has not been evaluated in prognostic model research in oncology, including studies developing and validating models for individualized risk prediction. STUDY DESIGN AND SETTING We conducted a systematic review, searching MEDLINE and EMBASE for oncology-related studies that developed and validated a prognostic model using machine learning published between 1st January, 2019, and 5th September, 2019. We used existing spin frameworks and described areas of highly suggestive spin practices. RESULTS We included 62 publications (including 152 developed models; 37 validated models). Reporting was inconsistent between methods and the results in 27% of studies due to additional analysis and selective reporting. Thirty-two studies (out of 36 applicable studies) reported comparisons between developed models in their discussion and predominantly used discrimination measures to support their claims (78%). Thirty-five studies (56%) used an overly strong or leading word in their title, abstract, results, discussion, or conclusion. CONCLUSION The potential for spin needs to be considered when reading, interpreting, and using studies that developed and validated prognostic models in oncology. Researchers should carefully report their prognostic model research using words that reflect their actual results and strength of evidence.
Affiliation(s)
- Paula Dhiman
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK; NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK.
- Jie Ma
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Constanza L Andaur Navarro
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Benjamin Speich
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK; Meta-Research Centre, Department of Clinical Research, University Hospital Basel, University of Basel, Basel, Switzerland
- Garrett Bullock
- Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford, UK
- Johanna A A Damen
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Lotty Hooft
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Shona Kirtley
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Richard D Riley
- Centre for Prognosis Research, School of Medicine, Keele University, Staffordshire, UK, ST5 5BG
- Ben Van Calster
- Department of Development and Regeneration, KU Leuven, Leuven, Belgium; Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands; EPI-centre, KU Leuven, Leuven, Belgium
- Karel G M Moons
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Gary S Collins
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK; NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
17
Lin CH, Kuo YW, Huang YC, Lee M, Huang YW, Kuo CF, Lee JD. Development and Validation of a Novel Score for Predicting Long-Term Mortality after an Acute Ischemic Stroke. Int J Environ Res Public Health 2023; 20:3043. [PMID: 36833741 PMCID: PMC9961287 DOI: 10.3390/ijerph20043043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Revised: 02/04/2023] [Accepted: 02/06/2023] [Indexed: 06/18/2023]
Abstract
BACKGROUND Long-term mortality prediction can guide feasible discharge care plans and coordinate appropriate rehabilitation services. We aimed to develop and validate a prediction model to identify patients at risk of mortality after acute ischemic stroke (AIS). METHODS The primary outcome was all-cause mortality, and the secondary outcome was cardiovascular death. This study included 21,463 patients with AIS. Three risk prediction models were developed and evaluated: a penalized Cox model, a random survival forest model, and a DeepSurv model. A simplified risk scoring system, called the C-HAND (history of Cancer before admission, Heart rate, Age, eNIHSS, and Dyslipidemia) score, was created based on regression coefficients in the multivariate Cox model for both study outcomes. RESULTS All experimental models achieved a concordance index of 0.8, with no significant difference in predicting poststroke long-term mortality. The C-HAND score exhibited reasonable discriminative ability for both study outcomes, with concordance indices of 0.775 and 0.798. CONCLUSIONS Reliable prediction models for long-term poststroke mortality were developed using information routinely available to clinicians during hospitalization.
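The concordance index used to compare these survival models generalizes the AUC to censored follow-up: among pairs of patients whose ordering of death times is known, it is the fraction in which the patient who died earlier was assigned the higher risk. A minimal Python sketch of Harrell's version follows. The function name and the toy cohort are assumptions for illustration, and the abstract does not specify which C-index estimator the authors used.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times:       follow-up time for each patient
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: model output where a higher value means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier death in a pair
        for j in range(n):
            if times[i] < times[j]:  # patient i died before j's last follow-up
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical cohort: the third patient is censored at month 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.7, 0.5, 0.1]))  # risks fully ordered
print(concordance_index(times, events, [0.9, 0.1, 0.5, 0.7]))  # second patient misranked
```

A value of 0.5 corresponds to random ordering, which is the baseline against which the reported indices of roughly 0.78 to 0.80 should be read.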
Affiliation(s)
- Ching-Heng Lin
- Center for Artificial Intelligence in Medicine, Chang Gung Memorial Hospital, Taoyuan 333, Taiwan
- Bachelor Program in Artificial Intelligence, Chang Gung University, Taoyuan 333, Taiwan
- Ya-Wen Kuo
- Department of Nursing, Chang Gung University of Science and Technology, Chiayi Campus, Chiayi 613, Taiwan
- Associate Research Fellow, Chang Gung Memorial Hospital, Chiayi 613, Taiwan
- Yen-Chu Huang
- Department of Neurology, Chiayi Chang Gung Memorial Hospital, Chiayi 613, Taiwan
- College of Medicine, Chang Gung University, Taoyuan 333, Taiwan
- Meng Lee
- Department of Neurology, Chiayi Chang Gung Memorial Hospital, Chiayi 613, Taiwan
- College of Medicine, Chang Gung University, Taoyuan 333, Taiwan
- Yi-Wei Huang
- Center for Artificial Intelligence in Medicine, Chang Gung Memorial Hospital, Taoyuan 333, Taiwan
- Chang-Fu Kuo
- Center for Artificial Intelligence in Medicine, Chang Gung Memorial Hospital, Taoyuan 333, Taiwan
- Division of Rheumatology, Allergy, and Immunology, Chang Gung Memorial Hospital, Taoyuan 333, Taiwan
- Jiann-Der Lee
- Department of Neurology, Chiayi Chang Gung Memorial Hospital, Chiayi 613, Taiwan
- College of Medicine, Chang Gung University, Taoyuan 333, Taiwan
18
Hossain D, Scott SH, Cluff T, Dukelow SP. The use of machine learning and deep learning techniques to assess proprioceptive impairments of the upper limb after stroke. J Neuroeng Rehabil 2023; 20:15. [PMID: 36707846 PMCID: PMC9881388 DOI: 10.1186/s12984-023-01140-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/01/2022] [Accepted: 01/18/2023] [Indexed: 01/28/2023] Open
Abstract
BACKGROUND Robots can generate rich kinematic datasets that have the potential to provide far more insight into impairments than standard clinical ordinal scales. Determining how to define the presence or absence of impairment in individuals using kinematic data, however, can be challenging. Machine learning techniques offer a potential solution to this problem. In the present manuscript we examine proprioception in stroke survivors using a robotic arm position matching task. Proprioception is impaired in 50-60% of stroke survivors and has been associated with poorer motor recovery and longer lengths of hospital stay. We present a simple cut-off score technique for individual kinematic parameters and an overall task score to determine impairment. We then compare the ability of different machine learning (ML) techniques and the above-mentioned task score to correctly classify individuals with or without stroke based on kinematic data. METHODS Participants performed an Arm Position Matching (APM) task in an exoskeleton robot. The task produced 12 kinematic parameters that quantify multiple attributes of position sense. We first quantified impairment in individual parameters and an overall task score by determining if participants with stroke fell outside of the 95% cut-off score of control (normative) values. Then, we applied five machine learning algorithms (i.e., Logistic Regression, Decision Tree, Random Forest, Random Forest with Hyperparameters Tuning, and Support Vector Machine), and a deep learning algorithm (i.e., Deep Neural Network) to classify individual participants as to whether or not they had a stroke based only on kinematic parameters using a tenfold cross-validation approach. RESULTS We recruited 429 participants with neuroimaging-confirmed stroke (< 35 days post-stroke) and 465 healthy controls. Depending on the APM parameter, we observed that 10.9-48.4% of stroke participants were impaired, while 44% were impaired based on their overall task score. 
The mean performance metrics of machine learning and deep learning models were: accuracy 82.4%, precision 85.6%, recall 76.5%, and F1 score 80.6%. All machine learning and deep learning models displayed similar classification accuracy; however, the Random Forest model had the highest numerical accuracy (83%). Our models showed higher sensitivity and specificity (AUC = 0.89) in classifying individual participants than the overall task score (AUC = 0.85) based on their performance in the APM task. We also found that variability was the most important feature in classifying performance in the APM task. CONCLUSION Our ML models displayed similar classification performance. ML models were able to integrate more kinematic information and relationships between variables into decision making and displayed better classification performance than the overall task score. ML may help to provide insight into individual kinematic features that have previously been overlooked with respect to clinical importance.
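The precision, recall, and F1 figures quoted here are linked by a fixed formula: F1 is the harmonic mean of precision and recall. The Python sketch below shows the relationship, first from confusion-matrix counts and then from the reported mean precision and recall. The function names and the counts are illustrative assumptions. Plugging the reported means into the formula gives about 80.8%, slightly above the reported 80.6%. One plausible reason, not confirmed by the abstract, is that the paper averages each model's own F1 score, and a mean of harmonic means generally differs from the harmonic mean of the means.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Precision and recall from confusion-matrix counts."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 stroke cases found, 2 false alarms, 4 cases missed.
p, r = precision_recall(true_pos=8, false_pos=2, false_neg=4)
print(p, r, f1_score(p, r))

# F1 implied by the mean precision and recall reported in the abstract.
print(f1_score(0.856, 0.765))
```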
Affiliation(s)
- Delowar Hossain
- Department of Clinical Neuroscience, Cumming School of Medicine, University of Calgary, Calgary, AB Canada
- Stephen H. Scott
- Department of Biomedical and Molecular Sciences, Queen’s University, Kingston, ON Canada
- Tyler Cluff
- Faculty of Kinesiology, University of Calgary, Calgary, AB Canada
- Sean P. Dukelow
- Department of Clinical Neuroscience, Cumming School of Medicine, University of Calgary, Calgary, AB Canada
| |
|
19
|
Fast L, Temuulen U, Villringer K, Kufner A, Ali HF, Siebert E, Huo S, Piper SK, Sperber PS, Liman T, Endres M, Ritter K. Machine learning-based prediction of clinical outcomes after first-ever ischemic stroke. Front Neurol 2023; 14:1114360. [PMID: 36895902 PMCID: PMC9990416 DOI: 10.3389/fneur.2023.1114360] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/02/2022] [Accepted: 01/31/2023] [Indexed: 02/23/2023] Open
Abstract
Background Accurate prediction of clinical outcomes in individual patients following acute stroke is vital for healthcare providers to optimize treatment strategies and plan further patient care. Here, we use advanced machine learning (ML) techniques to systematically compare the prediction of functional recovery, cognitive function, depression, and mortality of first-ever ischemic stroke patients and to identify the leading prognostic factors. Methods We predicted clinical outcomes for 307 patients (151 females, 156 males; 68 ± 14 years) from the PROSpective Cohort with Incident Stroke Berlin study using 43 baseline features. Outcomes included modified Rankin Scale (mRS), Barthel Index (BI), Mini-Mental State Examination (MMSE), Modified Telephone Interview for Cognitive Status (TICS-M), Center for Epidemiologic Studies Depression Scale (CES-D) and survival. The ML models included a Support Vector Machine with a linear kernel and a radial basis function kernel as well as a Gradient Boosting Classifier based on repeated 5-fold nested cross-validation. The leading prognostic features were identified using Shapley additive explanations. Results The ML models achieved significant prediction performance for mRS at patient discharge and after 1 year, BI and MMSE at patient discharge, TICS-M after 1 and 3 years and CES-D after 1 year. Additionally, we showed that National Institutes of Health Stroke Scale (NIHSS) was the top predictor for most functional recovery outcomes as well as education for cognitive function and depression. Conclusion Our machine learning analysis successfully demonstrated the ability to predict clinical outcomes after first-ever ischemic stroke and identified the leading prognostic factors that contribute to this prediction.
Affiliation(s)
- Lea Fast
- Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Department of Psychiatry and Psychotherapy, Berlin, Germany
| | - Uchralt Temuulen
- Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Center for Stroke Research Berlin (CSB), Berlin, Germany
| | - Kersten Villringer
- Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Center for Stroke Research Berlin (CSB), Berlin, Germany
| | - Anna Kufner
- Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Center for Stroke Research Berlin (CSB), Berlin, Germany; Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany; Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Department of Neurology with Experimental Neurology, Berlin, Germany
| | - Huma Fatima Ali
- Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Berlin, Germany
| | - Eberhard Siebert
- Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Department of Neuroradiology, Berlin, Germany
| | - Shufan Huo
- Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Center for Stroke Research Berlin (CSB), Berlin, Germany; Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Department of Neurology with Experimental Neurology, Berlin, Germany; German Center for Cardiovascular Research (Deutsches Zentrum für Herz-Kreislauferkrankungen, DZHK), Partner Site Berlin, Berlin, Germany
| | - Sophie K Piper
- Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany; Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Institute of Biometry and Clinical Epidemiology, Berlin, Germany; Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Institute of Medical Informatics, Berlin, Germany
| | - Pia Sophie Sperber
- Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Center for Stroke Research Berlin (CSB), Berlin, Germany; Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, NeuroCure Cluster of Excellence, NeuroCure Clinical Research Center (NCRC), Berlin, Germany; Experimental and Clinical Research Center, A Cooperation Between the Max Delbrück Center for Molecular Medicine in the Helmholtz Association and Charité - Universitätsmedizin Berlin, Berlin, Germany; Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
| | - Thomas Liman
- Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Center for Stroke Research Berlin (CSB), Berlin, Germany; German Center for Cardiovascular Research (Deutsches Zentrum für Herz-Kreislauferkrankungen, DZHK), Partner Site Berlin, Berlin, Germany; German Center for Neurodegenerative Diseases (Deutsches Zentrum für Neurodegenerative Erkrankungen, DZNE), Partner Site Berlin, Berlin, Germany; Department of Neurology, Evangelical Hospital Oldenburg, Carl von Ossietzky-University, Oldenburg, Germany
| | - Matthias Endres
- Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Center for Stroke Research Berlin (CSB), Berlin, Germany; Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany; Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Department of Neurology with Experimental Neurology, Berlin, Germany; German Center for Cardiovascular Research (Deutsches Zentrum für Herz-Kreislauferkrankungen, DZHK), Partner Site Berlin, Berlin, Germany; Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, NeuroCure Cluster of Excellence, NeuroCure Clinical Research Center (NCRC), Berlin, Germany; German Center for Neurodegenerative Diseases (Deutsches Zentrum für Neurodegenerative Erkrankungen, DZNE), Partner Site Berlin, Berlin, Germany
| | - Kerstin Ritter
- Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Department of Psychiatry and Psychotherapy, Berlin, Germany; Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Bernstein Center for Computational Neuroscience (BCCN), Berlin, Germany
| |
|
20
|
Montazeri M, Montazeri M, Bahaadinbeigy K, Montazeri M, Afraz A. Application of machine learning methods in predicting schizophrenia and bipolar disorders: A systematic review. Health Sci Rep 2022; 6:e962. [PMID: 36589632 PMCID: PMC9795991 DOI: 10.1002/hsr2.962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2021] [Revised: 11/10/2022] [Accepted: 11/15/2022] [Indexed: 12/29/2022] Open
Abstract
Background and Aim Schizophrenia and bipolar disorder (BD) are critical and high-risk inherited mental disorders with debilitating symptoms. Worldwide, 3% of the population suffers from these disorders. The mortality rate of these patients is higher than that of the general population. Current procedures cannot effectively diagnose these disorders because it takes an average of 10 years from the onset of the first symptoms to the definitive diagnosis of the disease. Machine learning (ML) techniques are used to meet this need. This study aimed to summarize information on the use of ML techniques for predicting schizophrenia and BD to help early and timely diagnosis of the disease. Methods A systematic literature search included articles published until January 19, 2020 in 3 databases. Two reviewers independently assessed original papers to determine eligibility for inclusion in this review. PRISMA guidelines were followed to conduct the study, and the Prediction Model Risk of Bias Assessment Tool (PROBAST) was used to assess included papers. Results In this review, 1243 papers were retrieved through database searches, of which 15 papers were included based on full-text assessment. ML techniques were used to predict schizophrenia and BD. The main algorithms applied were support vector machine (SVM) (10 studies), random forests (RF) (5 studies), and gradient boosting (GB) (3 studies). Input and output characteristics were very diverse and have been kept to enable future research. RF algorithms demonstrated significantly higher accuracy and sensitivity than SVM and GB. GB demonstrated significantly higher specificity than SVM and RF. We found no significant difference between RF and SVM in terms of specificity. Conclusion ML can precisely predict results and assist in making clinical decisions concerning schizophrenia and BD. RF often performed better than other algorithms in supervised learning tasks.
This study identified gaps in the literature and opportunities for future psychological ML research.
Affiliation(s)
- Mahdieh Montazeri
- Department of Health Information Sciences, Faculty of Management and Medical Information Sciences, Kerman University of Medical Sciences, Kerman, Iran; Medical Informatics Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran
| | - Mitra Montazeri
- Medical Informatics Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran
| | - Kambiz Bahaadinbeigy
- Medical Informatics Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran
| | - Mohadeseh Montazeri
- Department of Computer, Faculty of Fatimah, Kerman Branch, Technical and Vocational University, Kerman, Iran
| | - Ali Afraz
- Medical Informatics Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran
| |
|
21
|
Reproducibility of artificial intelligence models in computed tomography of the head: a quantitative analysis. Insights Imaging 2022; 13:173. [PMID: 36303079 PMCID: PMC9613832 DOI: 10.1186/s13244-022-01311-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/31/2022] [Accepted: 10/02/2022] [Indexed: 11/10/2022] Open
Abstract
When developing artificial intelligence (AI) software for applications in radiology, the underlying research must be transferable to other real-world problems. To verify to what degree this is true, we reviewed research on AI algorithms for computed tomography of the head. A systematic review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). We identified 83 articles and analyzed them in terms of transparency of data and code, pre-processing, type of algorithm, architecture, hyperparameter, performance measure, and balancing of dataset in relation to epidemiology. We also classified all articles by their main functionality (classification, detection, segmentation, prediction, triage, image reconstruction, image registration, fusion of imaging modalities). We found that only a minority of authors provided open source code (10.15%, n = 7), making the replication of results difficult. Convolutional neural networks were predominantly used (32.61%, n = 15), whereas hyperparameters were less frequently reported (32.61%, n = 15). Data sets were mostly from single center sources (84.05%, n = 58), increasing the susceptibility of the models to bias, which increases the error rate of the models. The prevalence of brain lesions in the training (0.49 ± 0.30) and testing (0.45 ± 0.29) datasets differed from real-world epidemiology (0.21 ± 0.28), which may lead to overestimated performance. This review highlights the need for open source code, external validation, and consideration of disease prevalence.
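The point about prevalence can be made concrete with Bayes' rule for the positive predictive value (PPV). The sketch below is illustrative only: the sensitivity and specificity of 0.90 are assumed values, not figures from the review, while the two prevalences (0.45 and 0.21) are the testing and real-world means quoted in the abstract.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV = TP / (TP + FP), expressed per unit of population."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Assumed classifier characteristics (hypothetical, for illustration).
SENSITIVITY = 0.90
SPECIFICITY = 0.90

ppv_test_set = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.45)
ppv_real_world = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.21)
print(round(ppv_test_set, 3), round(ppv_real_world, 3))
```

With sensitivity and specificity held fixed, the same model yields a lower PPV at the lower real-world prevalence, which is one way performance reported on enriched datasets can look better than it would in practice.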
|
22
|
Babatunde AO, Togunwa TO, Awosiku O, Siddiqui MF, Rabiu AT, Akintola AA, Dauda BJ, Aborode AT. Internet of Things, Machine Learning, and Blockchain Technology: Emerging technologies revolutionizing Universal Health Coverage. Front Public Health 2022; 10:1024203. [PMID: 36353272 PMCID: PMC9638084 DOI: 10.3389/fpubh.2022.1024203] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/21/2022] [Accepted: 10/11/2022] [Indexed: 01/28/2023] Open
Affiliation(s)
- Abdulhammed Opeyemi Babatunde
- Department of Medicine and Surgery, Faculty of Clinical Sciences, College of Medicine, University of Ibadan, Ibadan, Oyo, Nigeria; Healthy Africans Platform, Research and Development, Ibadan, Nigeria; Standing Committee on Medical Education and Research, Federation of African Medical Students' Associations (FAMSA), Ibadan, Nigeria. Correspondence: Abdulhammed Opeyemi Babatunde
| | - Taofeeq Oluwatosin Togunwa
- Department of Medicine and Surgery, Faculty of Clinical Sciences, College of Medicine, University of Ibadan, Ibadan, Oyo, Nigeria
| | | | | | | | - Abdulqudus Abimbola Akintola
- Department of Medicine and Surgery, Faculty of Clinical Sciences, College of Medicine, University of Ibadan, Ibadan, Oyo, Nigeria
| | - Babatunde Jamiu Dauda
- Department of Medicine and Surgery, Faculty of Clinical Sciences, College of Medicine, University of Ibadan, Ibadan, Oyo, Nigeria
| | | |
|
23
|
Kokkotis C, Giarmatzis G, Giannakou E, Moustakidis S, Tsatalas T, Tsiptsios D, Vadikolias K, Aggelousis N. An Explainable Machine Learning Pipeline for Stroke Prediction on Imbalanced Data. Diagnostics (Basel) 2022; 12:diagnostics12102392. [PMID: 36292081 PMCID: PMC9600473 DOI: 10.3390/diagnostics12102392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Revised: 09/26/2022] [Accepted: 09/27/2022] [Indexed: 11/16/2022] Open
Abstract
Stroke is an acute neurological dysfunction attributed to a focal injury of the central nervous system due to reduced blood flow to the brain. Nowadays, stroke is a global threat associated with premature death and huge economic consequences. Hence, there is an urgency to model the effect of several risk factors on stroke occurrence, and artificial intelligence (AI) seems to be the appropriate tool. In the present study, we aimed to (i) develop reliable machine learning (ML) prediction models for stroke disease; (ii) cope with a typical severe class imbalance problem, which is posed due to the stroke patients’ class being significantly smaller than the healthy class; and (iii) interpret the model output for understanding the decision-making mechanism. The effectiveness of the proposed ML approach was investigated in a comparative analysis with six well-known classifiers with respect to metrics that are related to both generalization capability and prediction accuracy. The best overall false-negative rate was achieved by the Multi-Layer Perceptron (MLP) classifier (18.60%). Shapley Additive Explanations (SHAP) were employed to investigate the impact of the risk factors on the prediction output. The proposed AI method could lead to the creation of advanced and effective risk stratification strategies for each stroke patient, which would allow for timely diagnosis and the right treatments.
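Why the false-negative rate, rather than accuracy, is the metric of interest under severe class imbalance can be shown with a small worked example. The cohort below is hypothetical (1000 people, 50 with stroke) and the "classifier" is a deliberately naive one that labels everyone as healthy; none of these numbers come from the study.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def false_negative_rate(tp, fn):
    """Fraction of true stroke cases that the classifier missed."""
    return fn / (tp + fn)


# Hypothetical imbalanced cohort: 950 healthy, 50 stroke.
# A naive classifier that always predicts "healthy" produces:
tp, fp, tn, fn = 0, 0, 950, 50

naive_accuracy = accuracy(tp, fp, tn, fn)
naive_fnr = false_negative_rate(tp, fn)
print(naive_accuracy, naive_fnr)
```

The naive rule scores high on accuracy while missing every stroke case, so comparing classifiers on the false-negative rate, as the abstract does, guards against this failure mode.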
Affiliation(s)
- Christos Kokkotis
- Department of Physical Education and Sport Science, Democritus University of Thrace, 69100 Komotini, Greece
| | - Georgios Giarmatzis
- Department of Physical Education and Sport Science, Democritus University of Thrace, 69100 Komotini, Greece
| | - Erasmia Giannakou
- Department of Physical Education and Sport Science, Democritus University of Thrace, 69100 Komotini, Greece
| | | | - Themistoklis Tsatalas
- Department of Physical Education and Sport Science, University of Thessaly, 38221 Trikala, Greece
| | - Dimitrios Tsiptsios
- Department of Neurology, School of Medicine, University Hospital of Alexandroupolis, Democritus University of Thrace, 68100 Alexandroupolis, Greece
| | - Konstantinos Vadikolias
- Department of Neurology, School of Medicine, University Hospital of Alexandroupolis, Democritus University of Thrace, 68100 Alexandroupolis, Greece
| | - Nikolaos Aggelousis
- Department of Physical Education and Sport Science, Democritus University of Thrace, 69100 Komotini, Greece
| |
|
24
|
Martinez-Millana A, Saez-Saez A, Tornero-Costa R, Azzopardi-Muscat N, Traver V, Novillo-Ortiz D. Artificial intelligence and its impact on the domains of universal health coverage, health emergencies and health promotion: An overview of systematic reviews. Int J Med Inform 2022; 166:104855. [PMID: 35998421 PMCID: PMC9551134 DOI: 10.1016/j.ijmedinf.2022.104855] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/27/2022] [Revised: 08/01/2022] [Accepted: 08/11/2022] [Indexed: 12/04/2022]
Abstract
An overview of systematic reviews on the application of AI including 129 studies. AI use is prominent in Universal Health Coverage, featuring image analysis in neoplasms. Half of the reviews evaluated neither validation procedures nor reporting guidelines. Risk of bias was only included in a third of the reviews. There is not sufficient evidence to transfer AI to actual healthcare delivery.
Background Artificial intelligence is fueling a new revolution in medicine and in the healthcare sector. Despite the growing evidence on the benefits of artificial intelligence, there are several aspects that limit the measurement of its impact on people’s health. It is necessary to assess the current status of the application of AI towards the improvement of people’s health in the domains defined by WHO’s Thirteenth General Programme of Work (GPW13) and the European Programme of Work (EPW), to inform about trends, gaps, opportunities, and challenges. Objective To perform a systematic overview of systematic reviews on the application of artificial intelligence in the people’s health domains as defined in the GPW13 and provide a comprehensive and updated map on the application specialties of artificial intelligence in terms of methodologies, algorithms, data sources, outcomes, predictors, performance, and methodological quality. Methods A systematic search in MEDLINE, EMBASE, Cochrane and IEEEXplore was conducted between January 2015 and June 2021 to collect systematic reviews using a combination of keywords related to the domains of universal health coverage, health emergencies protection, and better health and wellbeing as defined by the WHO’s GPW13 and EPW. Eligibility criteria were based on methodological quality and the inclusion of practical implementation of artificial intelligence. Records were classified and labeled using ICD-11 categories into the domains of the GPW13. Descriptors related to the area of implementation, type of modeling, data entities, outcomes and implementation on care delivery were extracted using a structured form, and the methodological quality of the included reviews was assessed using the AMSTAR checklist. Results The search strategy resulted in the screening of 815 systematic reviews from which 203 were assessed for eligibility and 129 were included in the review.
The most predominant domain for artificial intelligence applications was Universal Health Coverage (N = 98) followed by Health Emergencies (N = 16) and Better Health and Wellbeing (N = 15). Within Universal Health Coverage, neoplasms were the disease area with the most applications (21.7 %, N = 28). The reviews featured analytics primarily over both public and private data sources (67.44 %, N = 87). The most used type of data was medical imaging (31.8 %, N = 41), with predictors based on regions of interest and clinical data. The most prominent subdomain of Artificial Intelligence was Machine Learning (43.4 %, N = 56), in which the Support Vector Machine method was predominant (20.9 %, N = 27). Regarding the purpose, the application of Artificial Intelligence is focused on the prediction of diseases (36.4 %, N = 47). With respect to validation, more than half of the reviews (54.3 %, N = 70) did not report a validation procedure and, whenever available, the main performance indicator was accuracy (28.7 %, N = 37). According to the methodological quality assessment, a third of the reviews (34.9 %, N = 45) implemented methods for analyzing the risk of bias, and the overall AMSTAR score was below 5 (4.01 ± 1.93) across all the included systematic reviews. Conclusion Artificial intelligence is being used for disease modelling, diagnosis, classification and prediction in the three domains of GPW13. However, the evidence is often limited to laboratory settings and the level of adoption is largely unbalanced between ICD-11 categories and diseases. Data availability is a determinant factor in the developmental stage of artificial intelligence applications. Most of the reviewed studies show a poor methodological quality and are at high risk of bias, which limits the reproducibility of the results and the reliability of translating these applications to real clinical scenarios.
The analyzed papers show results only in laboratory and testing scenarios and not in clinical trials or case studies, limiting the supporting evidence to transfer artificial intelligence to actual care delivery.
Affiliation(s)
- Antonio Martinez-Millana
- Instituto Universitario de Investigación de Aplicaciones de las Tecnologías de la Información y de las Comunicaciones Avanzadas (ITACA), Universitat Politècnica de València, Camino de Vera S/N, Valencia 46022, Spain
| | - Aida Saez-Saez
- Instituto Universitario de Investigación de Aplicaciones de las Tecnologías de la Información y de las Comunicaciones Avanzadas (ITACA), Universitat Politècnica de València, Camino de Vera S/N, Valencia 46022, Spain
| | - Roberto Tornero-Costa
- Instituto Universitario de Investigación de Aplicaciones de las Tecnologías de la Información y de las Comunicaciones Avanzadas (ITACA), Universitat Politècnica de València, Camino de Vera S/N, Valencia 46022, Spain
| | - Natasha Azzopardi-Muscat
- Division of Country Health Policies and Systems, World Health Organization, Regional Office for Europe, Copenhagen, Denmark
| | - Vicente Traver
- Instituto Universitario de Investigación de Aplicaciones de las Tecnologías de la Información y de las Comunicaciones Avanzadas (ITACA), Universitat Politècnica de València, Camino de Vera S/N, Valencia 46022, Spain
| | - David Novillo-Ortiz
- Division of Country Health Policies and Systems, World Health Organization, Regional Office for Europe, Copenhagen, Denmark.
| |
|
25
|
Kothari R, Chiu C, Moukheiber M, Jehiro M, Bishara A, Lee C, Piracchio R, Celi LA. A descriptive appraisal of quality of reporting in a cohort of machine learning studies in anesthesiology. Anaesth Crit Care Pain Med 2022; 41:101126. [PMID: 35811037 DOI: 10.1016/j.accpm.2022.101126] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/13/2022] [Revised: 05/18/2022] [Accepted: 05/19/2022] [Indexed: 12/13/2022]
Abstract
BACKGROUND The field of machine learning is being employed more and more in medicine. However, studies have shown that the quality of published studies frequently lacks completeness and adherence to published reporting guidelines. This assessment has not been done in the subspecialty of anesthesiology. METHODS We appraised the quality of reporting of a convenience sample of 67 peer-reviewed publications sourced from the scoping review by Hashimoto et al. Each publication was appraised on the presence of reporting elements (reporting compliance) selected from 4 peer-reviewed guidelines for reporting on machine learning studies. Results are described in several cross sections, including by section of manuscript (e.g. abstract, introduction, etc.), year of publication, impact factor of journal, and impact of publication. RESULTS On average, reporting compliance was 64% ± 13%. There was marked heterogeneity of reporting based on section of manuscript. There was a mild trend towards increased quality of reporting with increasing impact factor of journal of publication and increasing average number of citations per year since publication, and no trend regarding recency of publication. CONCLUSION The quality of reporting of machine learning studies in anesthesiology is on par with other fields, but can benefit from improvement, especially in presenting methodology, results, and discussion points, including interpretation of models and pitfalls therein. Clinicians in today's learning health systems will benefit from skills in appraisal of evidence. Several reporting guidelines have been released, and updates to mainstream guidelines are under development, which we hope will usher in improvement in reporting quality.
Affiliation(s)
- Rishi Kothari
- Department of Anesthesia and Perioperative Care, University of California San Francisco, San Francisco, CA 4143, USA.
| | - Catherine Chiu
- Department of Anesthesia and Perioperative Care, University of California San Francisco, San Francisco, CA 4143, USA
| | - Mira Moukheiber
- Picower Institute for Learning & Memory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Matthew Jehiro
- Department of Biostatistics, State University of New York at Buffalo, Buffalo, NY 14260, USA
| | - Andrew Bishara
- Department of Anesthesia and Perioperative Care, University of California San Francisco, San Francisco, CA 4143, USA
| | - Christine Lee
- Edwards Lifesciences, Critical Care, Irvine, CA 92614, USA
| | - Romain Piracchio
- Department of Anesthesia and Perioperative Care, University of California San Francisco, San Francisco, CA 4143, USA
| | - Leo Anthony Celi
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA 02215, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| |
|
26
|
Dhiman P, Ma J, Andaur Navarro CL, Speich B, Bullock G, Damen JAA, Hooft L, Kirtley S, Riley RD, Van Calster B, Moons KGM, Collins GS. Risk of bias of prognostic models developed using machine learning: a systematic review in oncology. Diagn Progn Res 2022; 6:13. [PMID: 35794668 PMCID: PMC9261114 DOI: 10.1186/s41512-022-00126-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/15/2021] [Accepted: 02/07/2022] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Prognostic models are used widely in the oncology domain to guide medical decision-making. Little is known about the risk of bias of prognostic models developed using machine learning and the barriers to their clinical uptake in the oncology domain. METHODS We conducted a systematic review and searched MEDLINE and EMBASE databases for oncology-related studies developing a prognostic model using machine learning methods published between 01/01/2019 and 05/09/2019. The primary outcome was risk of bias, judged using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). We described risk of bias overall and for each domain, by development and validation analyses separately. RESULTS We included 62 publications (48 development-only; 14 development with validation). 152 models were developed across all publications and 37 models were validated. 84% (95% CI: 77 to 89) of developed models and 51% (95% CI: 35 to 67) of validated models were at overall high risk of bias. Bias introduced in the analysis was the largest contributor to the overall risk of bias judgement for model development and validation. 123 (81%, 95% CI: 73.8 to 86.4) developed models and 19 (51%, 95% CI: 35.1 to 67.3) validated models were at high risk of bias due to their analysis, mostly due to shortcomings in the analysis including insufficient sample size and split-sample internal validation. CONCLUSIONS The quality of machine learning based prognostic models in the oncology domain is poor and most models have a high risk of bias, contraindicating their use in clinical practice. Adherence to better standards is urgently needed, with a focus on sample size estimation and analysis methods, to improve the quality of these models.
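Proportions such as "123 of 152 models (81%)" are reported here with 95% confidence intervals. The abstract does not say which interval method was used, so the sketch below uses the Wilson score interval purely as one common choice; its output is close to, but need not match, the published bounds.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half_width, center + half_width


# Counts taken from the abstract: 123 of 152 developed models at high
# risk of bias due to their analysis.
lower, upper = wilson_interval(123, 152)
print(f"{123 / 152:.1%} (95% CI: {lower:.1%} to {upper:.1%})")
```

The width of the interval depends strongly on the denominator, which is why the estimate for the 37 validated models is much less precise than the one for the 152 developed models.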
Affiliation(s)
- Paula Dhiman
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK.
- NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK.
| | - Jie Ma
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK
| | - Constanza L Andaur Navarro
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Benjamin Speich
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK
- Department of Clinical Research, Basel Institute for Clinical Epidemiology and Biostatistics, University Hospital Basel, University of Basel, Basel, Switzerland
| | - Garrett Bullock
- Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford, UK
| | - Johanna A A Damen
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Lotty Hooft
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Shona Kirtley
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK
| | - Richard D Riley
- Centre for Prognosis Research, School of Medicine, Keele University, Staffordshire, ST5 5BG, UK
| | - Ben Van Calster
- Department of Development and Regeneration, KU Leuven, Louvain, Belgium
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands
- EPI-Centre, KU Leuven, Louvain, Belgium
| | - Karel G M Moons
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Gary S Collins
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK
- NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
| |
Collapse
|
27
|
Machine learning predicts clinically significant health related quality of life improvement after sensorimotor rehabilitation interventions in chronic stroke. Sci Rep 2022; 12:11235. [PMID: 35787657 PMCID: PMC9253044 DOI: 10.1038/s41598-022-14986-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Accepted: 06/16/2022] [Indexed: 12/04/2022]
Abstract
Health-related quality of life (HRQOL) reflects individuals' perceived wellness in health domains and often deteriorates after stroke. Precise prediction of HRQOL changes after rehabilitation interventions is critical for optimizing stroke rehabilitation efficiency and efficacy. Machine learning (ML) has become a promising outcome prediction approach because of its high accuracy and ease of use. Incorporating ML models into rehabilitation practice may facilitate efficient and accurate clinical decision making. Therefore, this study aimed to determine whether ML algorithms could accurately predict clinically significant HRQOL improvements after stroke sensorimotor rehabilitation interventions and to identify important predictors. Five ML algorithms were used: random forest (RF), k-nearest neighbors (KNN), artificial neural network, support vector machine and logistic regression. Datasets from 132 people with chronic stroke were included. The Stroke Impact Scale was used for assessing multi-dimensional and global self-perceived HRQOL. Potential predictors included personal characteristics and baseline cognitive/motor/sensory/functional/HRQOL attributes. Data were divided into training and test sets. A tenfold cross-validation procedure with the training data set was used for developing models. The test set was used for determining model performance. Results revealed that RF was effective at predicting multidimensional HRQOL (accuracy: 85%; area under the receiver operating characteristic curve, AUC-ROC: 0.86) and global perceived recovery (accuracy: 80%; AUC-ROC: 0.75), and KNN was effective at predicting global perceived recovery (accuracy: 82.5%; AUC-ROC: 0.76). Age/gender, baseline HRQOL, wrist/hand muscle function, arm movement efficiency and sensory function were identified as crucial predictors.
Our study indicated that RF and KNN outperformed the other three models in predicting HRQOL recovery after sensorimotor rehabilitation in stroke patients and could be considered for future clinical application.
|
28
|
Vieira BH, Pamplona GSP, Fachinello K, Silva AK, Foss MP, Salmon CEG. On the prediction of human intelligence from neuroimaging: A systematic review of methods and reporting. Intelligence 2022. [DOI: 10.1016/j.intell.2022.101654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
29
|
Wang W, Rudd AG, Wang Y, Curcin V, Wolfe CD, Peek N, Bray B. Risk prediction of 30-day mortality after stroke using machine learning: a nationwide registry-based cohort study. BMC Neurol 2022; 22:195. [PMID: 35624434 PMCID: PMC9137068 DOI: 10.1186/s12883-022-02722-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/15/2021] [Accepted: 05/17/2022] [Indexed: 12/16/2022]
Abstract
Background We aimed to develop and validate machine learning (ML) models for 30-day stroke mortality for mortality risk stratification and as benchmarking models for quality improvement in stroke care. Methods Data from the UK Sentinel Stroke National Audit Program between 2013 and 2019 were used. Models were developed using XGBoost, Logistic Regression (LR), and LR with elastic net with/without interaction terms, using 80% randomly selected admissions from 2013 to 2018, validated on the remaining 20% of admissions, and temporally validated on 2019 admissions. The models were developed with 30 variables. A reference model was developed using LR and 4 variables. Performance of all models was evaluated in terms of discrimination, calibration, reclassification, Brier scores and decision curves. Results In total, 488,497 stroke patients with a 12.3% 30-day mortality rate were included in the analysis. In the 2019 temporal validation set, the XGBoost model obtained the lowest Brier score (0.069 (95% CI: 0.068–0.071)) and the highest area under the ROC curve (AUC) (0.895 (95% CI: 0.891–0.900)), outperforming the LR reference model by 0.04 AUC (p < 0.001) and the LR with elastic net and interaction term model by 0.003 AUC (p < 0.001). All models were perfectly calibrated for the low (< 5%) and moderate (5–15%) risk groups, with ≈1% underestimation for the high-risk group (> 15%). The XGBoost model reclassified 1648 (8.1%) cases classed as low-risk by the LR reference model as being moderate or high-risk and gained the most net benefit in decision curve analysis. Conclusions All models with 30 variables are potentially useful as benchmarking models in stroke-care quality improvement, with ML slightly outperforming others. Supplementary Information The online version contains supplementary material available at 10.1186/s12883-022-02722-1.
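The Brier score and AUC used above to compare models can be illustrated with a small dependency-free sketch. The toy outcome and risk values below are invented for illustration and have nothing to do with the registry data. The pairwise AUC shown here is quadratic in sample size, so it suits small examples only.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and the observed 0/1 outcome."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random event case is ranked above a random
    non-event case; tied predictions count as half a win."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = died within 30 days, 0 = survived.
y_true = [0, 0, 0, 1, 0, 1]
y_prob = [0.05, 0.10, 0.20, 0.70, 0.40, 0.30]

print(f"Brier score: {brier_score(y_true, y_prob):.3f}")
print(f"AUC: {auc(y_true, y_prob):.3f}")
```

A lower Brier score reflects better overall accuracy of the predicted risks, while the AUC captures discrimination only, which is why the abstract reports both alongside calibration.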
Affiliation(s)
- Wenjuan Wang
  - School of Population Health & Environmental Sciences, Faculty of Life Science and Medicine, King's College London, London, UK
- Anthony G Rudd
  - School of Population Health & Environmental Sciences, Faculty of Life Science and Medicine, King's College London, London, UK
- Yanzhong Wang
  - School of Population Health & Environmental Sciences, Faculty of Life Science and Medicine, King's College London, London, UK
  - NIHR Biomedical Research Centre, Guy's and St Thomas' NHS Foundation Trust and King's College London, London, UK
  - NIHR Applied Research Collaboration (ARC) South London, London, UK
- Vasa Curcin
  - School of Population Health & Environmental Sciences, Faculty of Life Science and Medicine, King's College London, London, UK
  - NIHR Biomedical Research Centre, Guy's and St Thomas' NHS Foundation Trust and King's College London, London, UK
  - NIHR Applied Research Collaboration (ARC) South London, London, UK
- Charles D Wolfe
  - School of Population Health & Environmental Sciences, Faculty of Life Science and Medicine, King's College London, London, UK
  - NIHR Biomedical Research Centre, Guy's and St Thomas' NHS Foundation Trust and King's College London, London, UK
  - NIHR Applied Research Collaboration (ARC) South London, London, UK
- Niels Peek
  - Division of Informatics, Imaging and Data Science, School of Health Sciences, University of Manchester, Manchester, UK
  - NIHR Manchester Biomedical Research Centre, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Benjamin Bray
  - School of Population Health & Environmental Sciences, Faculty of Life Science and Medicine, King's College London, London, UK
|
30
|
Polce EM, Kunze KN, Dooley MS, Piuzzi NS, Boettner F, Sculco PK. Efficacy and Applications of Artificial Intelligence and Machine Learning Analyses in Total Joint Arthroplasty: A Call for Improved Reporting. J Bone Joint Surg Am 2022; 104:821-832. [PMID: 35045061 DOI: 10.2106/jbjs.21.00717] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 02/01/2023]
Abstract
BACKGROUND There has been a considerable increase in total joint arthroplasty (TJA) research using machine learning (ML). Therefore, the purposes of this study were to synthesize the applications and efficacies of ML reported in the TJA literature, and to assess the methodological quality of these studies. METHODS PubMed, OVID/MEDLINE, and Cochrane libraries were queried in January 2021 for articles regarding the use of ML in TJA. Study demographics, topic, primary and secondary outcomes, ML model development and testing, and model presentation and validation were recorded. The TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis) guidelines were used to assess the methodological quality. RESULTS Fifty-five studies were identified: 31 investigated clinical outcomes and resource utilization; 11, activity and motion surveillance; 10, imaging detection; and 3, natural language processing. For studies reporting the area under the receiver operating characteristic curve (AUC), the median AUC (and range) was 0.80 (0.60 to 0.97) among 26 clinical outcome studies, 0.99 (0.83 to 1.00) among 6 imaging-based studies, and 0.88 (0.76 to 0.98) among 3 activity and motion surveillance studies. Twelve studies compared ML to logistic regression, with 9 (75%) reporting that ML was superior. The average number of TRIPOD guidelines met was 11.5 (range: 5 to 18), with 38 (69%) meeting greater than half of the criteria. Presentation and explanation of the full model for individual predictions and assessments of model calibration were poorly reported (<30%). CONCLUSIONS The performance of ML models was good to excellent when applied to a wide variety of clinically relevant outcomes in TJA. However, reporting of certain key methodological and model presentation criteria was inadequate. 
Despite the recent surge in TJA literature utilizing ML, the lack of consistent adherence to reporting guidelines needs to be addressed to bridge the gap between model development and clinical implementation.
Affiliation(s)
- Evan M Polce
  - University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin
- Kyle N Kunze
  - Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, NY
- Matthew S Dooley
  - University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin
- Nicolas S Piuzzi
  - Department of Orthopaedic Surgery, Cleveland Clinic, Cleveland, Ohio
- Friedrich Boettner
  - Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, NY
- Peter K Sculco
  - Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, NY
|
31
|
Dhiman P, Ma J, Andaur Navarro CL, Speich B, Bullock G, Damen JAA, Hooft L, Kirtley S, Riley RD, Van Calster B, Moons KGM, Collins GS. Methodological conduct of prognostic prediction models developed using machine learning in oncology: a systematic review. BMC Med Res Methodol 2022; 22:101. [PMID: 35395724 PMCID: PMC8991704 DOI: 10.1186/s12874-022-01577-x] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 11/21/2021] [Accepted: 03/18/2022] [Indexed: 12/13/2022]
Abstract
BACKGROUND Describe and evaluate the methodological conduct of prognostic prediction models developed using machine learning methods in oncology. METHODS We conducted a systematic review in MEDLINE and Embase between 01/01/2019 and 05/09/2019, for studies developing a prognostic prediction model using machine learning methods in oncology. We used the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement, Prediction model Risk Of Bias ASsessment Tool (PROBAST) and CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies (CHARMS) to assess the methodological conduct of included publications. Results were summarised by modelling type: regression-, non-regression-based and ensemble machine learning models. RESULTS Sixty-two publications met inclusion criteria developing 152 models across all publications. Forty-two models were regression-based, 71 were non-regression-based and 39 were ensemble models. A median of 647 individuals (IQR: 203 to 4059) and 195 events (IQR: 38 to 1269) were used for model development, and 553 individuals (IQR: 69 to 3069) and 50 events (IQR: 17.5 to 326.5) for model validation. A higher number of events per predictor was used for developing regression-based models (median: 8, IQR: 7.1 to 23.5), compared to alternative machine learning (median: 3.4, IQR: 1.1 to 19.1) and ensemble models (median: 1.7, IQR: 1.1 to 6). Sample size was rarely justified (n = 5/62; 8%). Some or all continuous predictors were categorised before modelling in 24 studies (39%). 46% (n = 24/62) of models reporting predictor selection before modelling used univariable analyses, a common method across all modelling types. Ten out of 24 models for time-to-event outcomes accounted for censoring (42%). A split sample approach was the most popular method for internal validation (n = 25/62, 40%). Calibration was reported in 11 studies.
Less than half of models were reported or made available. CONCLUSIONS The methodological conduct of machine learning based clinical prediction models is poor. Guidance is urgently needed, with increased awareness and education of minimum prediction modelling standards. Particular focus is needed on sample size estimation, development and validation analysis methods, and ensuring the model is available for independent validation, to improve quality of machine learning based clinical prediction models.
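The "events per predictor" figure discussed in this abstract is a simple ratio, sketched below for a binary outcome. The helper name and the choice of 25 candidate predictor parameters are hypothetical. The counts of 647 individuals and 195 events are the medians quoted above, combined here purely for illustration, since they need not come from the same study.

```python
def events_per_predictor(n_total, n_events, n_candidate_parameters):
    """Events per candidate predictor parameter for a binary outcome.

    The limiting count is the smaller of the event and non-event groups.
    """
    if n_candidate_parameters <= 0:
        raise ValueError("need at least one candidate predictor parameter")
    if not 0 <= n_events <= n_total:
        raise ValueError("n_events must lie between 0 and n_total")
    limiting_count = min(n_events, n_total - n_events)
    return limiting_count / n_candidate_parameters


# Hypothetical scenario: 25 candidate parameters is an assumed value.
epp = events_per_predictor(n_total=647, n_events=195, n_candidate_parameters=25)
print(f"Events per predictor: {epp:.1f}")
```

Counting candidate parameters rather than the predictors retained in the final model matters, because a categorical predictor with several levels contributes more than one parameter.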
Affiliation(s)
- Paula Dhiman
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK
  - NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Jie Ma
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK
- Constanza L Andaur Navarro
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
  - Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Benjamin Speich
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK
  - Basel Institute for Clinical Epidemiology and Biostatistics, Department of Clinical Research, University Hospital Basel, University of Basel, Basel, Switzerland
- Garrett Bullock
  - Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford, UK
- Johanna A A Damen
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
  - Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Lotty Hooft
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
  - Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Shona Kirtley
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK
- Richard D Riley
  - Centre for Prognosis Research, School of Medicine, Keele University, Staffordshire, ST5 5BG, UK
- Ben Van Calster
  - Department of Development and Regeneration, KU Leuven, Leuven, Belgium
  - Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands
  - EPI-centre, KU Leuven, Leuven, Belgium
- Karel G M Moons
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
  - Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Gary S Collins
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK
  - NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
|
32
|
Kurtz P, Peres IT, Soares M, Salluh JIF, Bozza FA. Hospital Length of Stay and 30-Day Mortality Prediction in Stroke: A Machine Learning Analysis of 17,000 ICU Admissions in Brazil. Neurocrit Care 2022; 37:313-321. [PMID: 35381967 DOI: 10.1007/s12028-022-01486-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/03/2021] [Accepted: 03/07/2022] [Indexed: 10/18/2022]
Abstract
BACKGROUND Hospital length of stay and mortality are associated with resource use and clinical severity, respectively, in patients admitted to the intensive care unit (ICU) with acute stroke. We proposed a structured data-driven methodology to develop length of stay and 30-day mortality prediction models in a large multicenter Brazilian ICU cohort. METHODS We analyzed data from 130 ICUs from 43 Brazilian hospitals. All consecutive adult patients admitted with stroke (ischemic or nontraumatic hemorrhagic) to the ICU from January 2011 to December 2020 were included. Demographic data, comorbidities, acute disease characteristics, organ support, and laboratory data were retrospectively analyzed by a data-driven methodology, which included seven different types of machine learning models applied to training and test sets of data. The best performing models, based on discrimination and calibration measures, are reported as the main results. Outcomes were hospital length of stay and 30-day in-hospital mortality. RESULTS Of 17,115 ICU admissions for stroke, 16,592 adult patients (13,258 ischemic and 3334 hemorrhagic) were analyzed; 4298 (26%) patients had a prolonged hospital length of stay (> 14 days), and 30-day mortality was 8% (n = 1392). Prolonged hospital length of stay was best predicted by the random forests model (Brier score = 0.17, area under the curve = 0.73, positive predictive value = 0.61, negative predictive value = 0.78). Mortality prediction also yielded the best discrimination and calibration through random forests (Brier score = 0.05, area under the curve = 0.90, positive predictive value = 0.66, negative predictive value = 0.94). Among the 20 strongest contributor variables in both models were (1) premorbid conditions (e.g., functional impairment), (2) multiple organ dysfunction parameters (e.g., hypotension, mechanical ventilation), and (3) acute neurological aspects of stroke (e.g., Glasgow coma scale score on admission, stroke type). 
CONCLUSIONS Hospital length of stay and 30-day mortality of patients admitted to the ICU with stroke were accurately predicted through machine learning methods, even in the absence of stroke-specific data, such as the National Institutes of Health Stroke Scale score or neuroimaging findings. The proposed methods using general intensive care databases may be used for resource use allocation planning and performance assessment of ICUs treating stroke. More detailed acute neurological and management data, as well as long-term functional outcomes, may improve the accuracy and applicability of future machine-learning-based prediction algorithms.
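The positive and negative predictive values quoted for these models depend on the probability threshold used to label a patient as high risk. A minimal sketch follows, assuming a threshold of 0.5 (the abstract does not state the cut-off) and using invented toy data rather than anything from the ICU cohort.

```python
def predictive_values(y_true, y_prob, threshold=0.5):
    """Return (PPV, NPV) after labelling predictions >= threshold as positive."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    # NaN signals that no patient fell on that side of the threshold.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Hypothetical toy data: 1 = outcome occurred, 0 = it did not.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.6, 0.7, 0.2, 0.1, 0.4, 0.3, 0.05]

ppv, npv = predictive_values(y_true, y_prob)
print(f"PPV: {ppv:.2f}, NPV: {npv:.2f}")
```

Both values also shift with outcome prevalence, so an NPV of 0.94 for an outcome with 8% incidence should be read against the 92% of patients who survive regardless of the model.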
Affiliation(s)
- Pedro Kurtz
  - D'Or Institute for Research and Education, Rio de Janeiro, Brazil
  - Hospital Copa Star, Rio de Janeiro, Brazil
  - Paulo Niemeyer State Brain Institute, Rio de Janeiro, Brazil
- Igor Tona Peres
  - Department of Industrial Engineering, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, Brazil
- Marcio Soares
  - D'Or Institute for Research and Education, Rio de Janeiro, Brazil
- Jorge I F Salluh
  - D'Or Institute for Research and Education, Rio de Janeiro, Brazil
  - Postgraduate Program of Internal Medicine, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Fernando A Bozza
  - D'Or Institute for Research and Education, Rio de Janeiro, Brazil
  - National Institute of Infectious Disease Evandro Chagas, Oswaldo Cruz Foundation, Rio de Janeiro, Brazil
|
33
|
Delpino F, Costa Â, Farias S, Chiavegatto Filho A, Arcêncio R, Nunes B. Machine learning for predicting chronic diseases: a systematic review. Public Health 2022; 205:14-25. [DOI: 10.1016/j.puhe.2022.01.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/27/2021] [Revised: 10/26/2021] [Accepted: 01/11/2022] [Indexed: 12/12/2022]
|
34
|
Huang AW, Haslberger M, Coulibaly N, Galárraga O, Oganisian A, Belbasis L, Panagiotou OA. Multivariable prediction models for health care spending using machine learning: a protocol of a systematic review. Diagn Progn Res 2022; 6:4. [PMID: 35321760 PMCID: PMC8943988 DOI: 10.1186/s41512-022-00119-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/09/2021] [Accepted: 01/18/2022] [Indexed: 12/11/2022]
Abstract
BACKGROUND With rising cost pressures on health care systems, machine-learning (ML)-based algorithms are increasingly used to predict health care costs. Despite their potential advantages, the successful implementation of these methods could be undermined by biases introduced in the design, conduct, or analysis of studies seeking to develop and/or validate ML models. The utility of such models may also be negatively affected by poor reporting of these studies. In this systematic review, we aim to evaluate the reporting quality, methodological characteristics, and risk of bias of ML-based prediction models for individual-level health care spending. METHODS We will systematically search PubMed and Embase to identify studies developing, updating, or validating ML-based models to predict an individual's health care spending for any medical condition, over any time period, and in any setting. We will exclude prediction models of aggregate-level health care spending, models used to infer causality, models using radiomics or speech parameters, models of non-clinically validated predictors (e.g., genomics), and cost-effectiveness analyses without predicting individual-level health care spending. We will extract data based on the Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modeling Studies (CHARMS), previously published research, and relevant recommendations. We will assess the adherence of ML-based studies to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement and examine the inclusion of transparency and reproducibility indicators (e.g. statements on data sharing). To assess the risk of bias, we will apply the Prediction model Risk Of Bias Assessment Tool (PROBAST). Findings will be stratified by study design, ML methods used, population characteristics, and medical field. 
DISCUSSION Our systematic review will appraise the quality, reporting, and risk of bias of ML-based models for individualized health care cost prediction. This review will provide an overview of the available models and give insights into the strengths and limitations of using ML methods for the prediction of health spending.
Affiliation(s)
- Andrew W Huang
  - Department of Health Services, Policy and Practice, Brown University School of Public Health, Providence, Rhode Island, USA
- Martin Haslberger
  - QUEST Center, Berlin Institute of Health, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Neto Coulibaly
  - Department of Health Services, Policy and Practice, Brown University School of Public Health, Providence, Rhode Island, USA
- Omar Galárraga
  - Department of Health Services, Policy and Practice, Brown University School of Public Health, Providence, Rhode Island, USA
- Arman Oganisian
  - Department of Biostatistics, Brown University School of Public Health, Providence, Rhode Island, USA
- Lazaros Belbasis
  - Meta-Research Innovation Center Berlin, QUEST Center, Berlin Institute of Health, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Orestis A Panagiotou
  - Department of Health Services, Policy and Practice, Brown University School of Public Health, Providence, Rhode Island, USA
  - Center for Evidence Synthesis in Health, Brown University School of Public Health, Providence, Rhode Island, USA
|
35
|
Filipow N, Main E, Sebire NJ, Booth J, Taylor AM, Davies G, Stanojevic S. Implementation of prognostic machine learning algorithms in paediatric chronic respiratory conditions: a scoping review. BMJ Open Respir Res 2022; 9:e001165. [PMID: 35297371 PMCID: PMC8928277 DOI: 10.1136/bmjresp-2021-001165] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/02/2021] [Accepted: 03/06/2022] [Indexed: 11/23/2022]
Abstract
Machine learning (ML) holds great potential for predicting clinical outcomes in heterogeneous chronic respiratory diseases (CRD) affecting children, where timely individualised treatments offer opportunities for health optimisation. This paper identifies rate-limiting steps in ML prediction model development that impair clinical translation and discusses regulatory, clinical and ethical considerations for ML implementation. A scoping review of ML prediction models in paediatric CRDs was undertaken using the PRISMA extension scoping review guidelines. From 1209 results, 25 articles published between 2013 and 2021 were evaluated for features of a good clinical prediction model using the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guidelines. Most of the studies were in asthma (80%), with few in cystic fibrosis (12%), bronchiolitis (4%) and childhood wheeze (4%). There were inconsistencies in model reporting and studies were limited by a lack of validation, and absence of equations or code for replication. Clinician involvement during ML model development is essential and diversity, equity and inclusion should be assessed at each step of the ML pipeline to ensure algorithms do not promote or amplify health disparities among marginalised groups. As ML prediction studies become more frequent, it is important that models are rigorously developed using published guidelines and take account of regulatory frameworks which depend on model complexity, patient safety, accountability and liability.
Affiliation(s)
- Nicole Filipow
  - UCL Great Ormond Street Institute of Child Health, University College London, London, UK
- Eleanor Main
  - UCL Great Ormond Street Institute of Child Health, University College London, London, UK
- Neil J Sebire
  - Population, Policy and Practice Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London, UK
  - GOSH NIHR BRC, Great Ormond Street Hospital for Children, London, UK
- John Booth
  - Population, Policy and Practice Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London, UK
  - GOSH NIHR BRC, Great Ormond Street Hospital for Children, London, UK
- Andrew M Taylor
  - GOSH NIHR BRC, Great Ormond Street Hospital for Children, London, UK
  - Institute of Cardiovascular Science, University College London, London, UK
- Gwyneth Davies
  - Population, Policy and Practice Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London, UK
  - GOSH NIHR BRC, Great Ormond Street Hospital for Children, London, UK
- Sanja Stanojevic
  - Community Health and Epidemiology, Dalhousie University, Halifax, Nova Scotia, Canada
|
36
|
The Allure of Big Data to Improve Stroke Outcomes: Review of Current Literature. Curr Neurol Neurosci Rep 2022; 22:151-160. [PMID: 35274192 PMCID: PMC8913242 DOI: 10.1007/s11910-022-01180-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Accepted: 12/23/2021] [Indexed: 11/03/2022]
Abstract
PURPOSE OF REVIEW To critically appraise literature on recent advances and methods using "big data" to evaluate stroke outcomes and associated factors. RECENT FINDINGS Recent big data studies provided new evidence on the incidence of stroke outcomes and on important emerging predictors of these outcomes. Main highlights included the identification of COVID-19 infection and exposure to low-dose particulate matter as emerging predictors of mortality post-stroke. Demographic (age, sex) and geographical (rural vs. urban) disparities in outcomes were also identified. There was a surge in methodological (e.g., machine learning and validation) studies aimed at maximizing the efficiency of big data for improving the prediction of stroke outcomes. However, considerable delays remain between data generation and publication. Big data are driving rapid innovations in research on stroke outcomes, generating novel evidence for bridging practice gaps. Opportunity exists to harness big data to drive real-time improvements in stroke outcomes.
|
37
|
Matsui H, Yamana H, Fushimi K, Yasunaga H. Development of Deep Learning Models for Predicting In-Hospital Mortality Using an Administrative Claims Database: Retrospective Cohort Study. JMIR Med Inform 2022; 10:e27936. [PMID: 34997958 PMCID: PMC8881780 DOI: 10.2196/27936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2021] [Revised: 06/05/2021] [Accepted: 01/02/2022] [Indexed: 11/13/2022]
Abstract
BACKGROUND Administrative claims databases have been used widely in studies because they have large sample sizes and are easily available. However, studies using administrative databases lack information on disease severity, so a risk adjustment method needs to be developed. OBJECTIVE We aimed to develop and validate deep learning-based prediction models for in-hospital mortality of acute care patients. METHODS The main model was developed using only administrative claims data (age, sex, diagnoses, and procedures on the day of admission). We also constructed disease-specific models for acute myocardial infarction, heart failure, stroke, and pneumonia using common severity indices for these diseases. Using the Japanese Diagnosis Procedure Combination data from July 2010 to March 2017, we identified 46,665,933 inpatients and divided them into derivation and validation cohorts in a ratio of 95:5. The main model was developed using a 9-layer deep neural network with 4 hidden dense layers that had 1000 nodes and were fully connected to adjacent layers. We evaluated model discrimination ability by an area under the receiver operating characteristic curve (AUC) and calibration ability by calibration plot. RESULTS Among the eligible patients, 2,005,035 (4.3%) died. Discrimination and calibration of the models were satisfactory. The AUC of the main model in the validation cohort was 0.954 (95% CI 0.954-0.955). The main model had higher discrimination ability than the disease-specific models. CONCLUSIONS Our deep learning-based model using diagnoses and procedures produced valid predictions of in-hospital mortality.
Affiliation(s)
- Hiroki Matsui
- Department of Clinical Epidemiology and Health Economics, School of Public Health, The University of Tokyo, Tokyo, Japan
- Hayato Yamana
- Department of Health Services Research, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Kiyohide Fushimi
- Department of Health Policy and Informatics, Tokyo Medical and Dental University Graduate School, Tokyo, Japan
- Hideo Yasunaga
- Department of Clinical Epidemiology and Health Economics, School of Public Health, The University of Tokyo, Tokyo, Japan
|
38
|
Feghali J, Sattari SA, Wicks EE, Gami A, Rapaport S, Azad TD, Yang W, Xu R, Tamargo RJ, Huang J. External Validation of a Neural Network Model in Aneurysmal Subarachnoid Hemorrhage: A Comparison With Conventional Logistic Regression Models. Neurosurgery 2022; 90:552-561. [PMID: 35113076 DOI: 10.1227/neu.0000000000001857] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2021] [Accepted: 11/10/2021] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Interest in machine learning (ML)-based predictive modeling has led to the development of models predicting outcomes after aneurysmal subarachnoid hemorrhage (aSAH), including the Nijmegen acute subarachnoid hemorrhage calculator (Nutshell). Generalizability of such models to external data remains unclear. OBJECTIVE To externally validate the performance of the Nutshell tool while comparing it with the conventional Subarachnoid Hemorrhage International Trialists (SAHIT) models and to review the ML literature on outcome prediction after aSAH and aneurysm treatment. METHODS A prospectively maintained database of patients with aSAH presenting consecutively to our institution in the 2013 to 2018 period was used. The web-based Nutshell and SAHIT calculators were used to derive the risks of poor long-term (12-18 months) outcomes and 30-day mortality. Discrimination was evaluated using the area under the curve (AUC), and calibration was investigated using calibration plots. The literature on relevant ML models was surveyed for a synopsis. RESULTS In 269 patients with aSAH, the SAHIT models outperformed the Nutshell tool (AUC: 0.786 vs 0.689, P = .025) in predicting long-term functional outcomes. A logistic regression model of the Nutshell variables derived from our data achieved adequate discrimination (AUC = 0.759) of poor outcomes. The SAHIT models outperformed the Nutshell tool in predicting 30-day mortality (AUC: 0.810 vs 0.636, P < .001). Calibration properties were more favorable for the SAHIT models. Most published aneurysm-related ML-based outcome models lack external validation and usable testing platforms. CONCLUSION The Nutshell tool demonstrated limited performance on external validation in comparison with the SAHIT models. External validation and the dissemination of testing platforms for ML models must be emphasized.
Affiliation(s)
- James Feghali
- Department of Neurosurgery, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
|
39
|
Groot OQ, Ogink PT, Lans A, Twining PK, Kapoor ND, DiGiovanni W, Bindels BJJ, Bongers MER, Oosterhoff JHF, Karhade AV, Oner FC, Verlaan J, Schwab JH. Machine learning prediction models in orthopedic surgery: A systematic review in transparent reporting. J Orthop Res 2022; 40:475-483. [PMID: 33734466 PMCID: PMC9290012 DOI: 10.1002/jor.25036] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/15/2021] [Revised: 03/10/2021] [Accepted: 03/15/2021] [Indexed: 02/04/2023]
Abstract
Machine learning (ML) studies are becoming increasingly popular in orthopedics but lack a critical appraisal of their adherence to peer-reviewed guidelines. The objective of this review was to (1) evaluate quality and transparent reporting of ML prediction models in orthopedic surgery based on the transparent reporting of multivariable prediction models for individual prognosis or diagnosis (TRIPOD), and (2) assess risk of bias with the Prediction model Risk Of Bias ASsessment Tool. A systematic review was performed to identify all ML prediction studies published in orthopedic surgery through June 18th, 2020. After screening 7138 studies, 59 studies met the study criteria and were included. Two reviewers independently extracted data and discrepancies were resolved by discussion with at least two additional reviewers present. Across all studies, the overall median completeness for the TRIPOD checklist was 53% (interquartile range 47%-60%). The overall risk of bias was low in 44% (n = 26), high in 41% (n = 24), and unclear in 15% (n = 9). High overall risk of bias was driven by incomplete reporting of performance measures, inadequate handling of missing data, and use of small datasets with inadequate outcome numbers. Although the number of ML studies in orthopedic surgery is increasing rapidly, over 40% of the existing models are at high risk of bias. Furthermore, over half incompletely reported their methods and/or performance measures. Until these issues are adequately addressed to give patients and providers trust in ML models, a considerable gap remains between the development of ML prediction models and their implementation in orthopedic practice.
Affiliation(s)
- Olivier Q. Groot
- Orthopedic Oncology Service, Department of Orthopedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Paul T. Ogink
- Department of Orthopedic Surgery, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Amanda Lans
- Orthopedic Oncology Service, Department of Orthopedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Peter K. Twining
- Orthopedic Oncology Service, Department of Orthopedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Neal D. Kapoor
- Orthopedic Oncology Service, Department of Orthopedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- William DiGiovanni
- Orthopedic Oncology Service, Department of Orthopedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Bas J. J. Bindels
- Department of Orthopedic Surgery, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Michiel E. R. Bongers
- Orthopedic Oncology Service, Department of Orthopedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Jacobien H. F. Oosterhoff
- Orthopedic Oncology Service, Department of Orthopedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Aditya V. Karhade
- Orthopedic Oncology Service, Department of Orthopedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- F. C. Oner
- Department of Orthopedic Surgery, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Jorrit‐Jan Verlaan
- Department of Orthopedic Surgery, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Joseph H. Schwab
- Orthopedic Oncology Service, Department of Orthopedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
|
40
|
Zhu B, Zhao J, Cao M, Du W, Yang L, Su M, Tian Y, Wu M, Wu T, Wang M, Zhao X, Zhao Z. Predicting 1-Hour Thrombolysis Effect of r-tPA in Patients With Acute Ischemic Stroke Using Machine Learning Algorithm. Front Pharmacol 2022; 12:759782. [PMID: 35046804 PMCID: PMC8762247 DOI: 10.3389/fphar.2021.759782] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2021] [Accepted: 11/29/2021] [Indexed: 01/01/2023] Open
Abstract
Background: Thrombolysis with r-tPA is recommended for patients after acute ischemic stroke (AIS) within 4.5 h of symptom onset. However, only a few patients benefit from this therapeutic regimen. Thus, we aimed to develop an interpretable machine learning (ML)–based model to predict the thrombolysis effect of r-tPA at the super-early stage. Methods: A total of 353 patients with AIS were divided into training and test data sets. We then used six ML algorithms and a recursive feature elimination (RFE) method to explore the relationship among the clinical variables along with the NIH stroke scale score 1 h after thrombolysis treatment. Shapley additive explanations and local interpretable model–agnostic explanation algorithms were applied to interpret the ML models and determine the importance of the selected features. Results: Altogether, 353 patients with an average age of 63.0 (56.0–71.0) years were enrolled in the study. Of these patients, 156 showed a favorable thrombolysis effect and 197 showed an unfavorable effect. A total of 14 variables were enrolled in the modeling, and 6 ML algorithms were used to predict the thrombolysis effect. After RFE screening, seven variables under the gradient boosting decision tree (GBDT) model (area under the curve = 0.81, specificity = 0.61, sensitivity = 0.9, and F1 score = 0.79) demonstrated the best performance. Of the seven variables, activated partial thromboplastin clotting time (time), B-type natriuretic peptide, and fibrin degradation products were the three most important clinical characteristics that might influence r-tPA efficiency. Conclusion: This study demonstrated that the GBDT model with the seven variables could better predict the early thrombolysis effect of r-tPA.
Affiliation(s)
- Bin Zhu
- Department of Pharmacy, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Jianlei Zhao
- Department of Neurology, The Second Hospital of Lanzhou University, Lanzhou, China
- Mingnan Cao
- Department of Pharmacy, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Wanliang Du
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Yue Tian
- Department of Pharmacy, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Mingfen Wu
- Department of Pharmacy, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Tingxi Wu
- Department of Pharmacy, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Manxia Wang
- Department of Neurology, The Second Hospital of Lanzhou University, Lanzhou, China
- Xingquan Zhao
- Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Zhigang Zhao
- Department of Pharmacy, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
|
41
|
Guzman-Vilca WC, Castillo-Cara M, Carrillo-Larco RM. Development, validation and application of a machine learning model to estimate salt consumption in 54 countries. eLife 2022; 11:72930. [PMID: 34984979 PMCID: PMC8789317 DOI: 10.7554/elife.72930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2021] [Accepted: 12/15/2021] [Indexed: 11/13/2022] Open
Abstract
Global targets to reduce salt intake have been proposed but their monitoring is challenged by the lack of population-based data on salt consumption. We developed a machine learning (ML) model to predict salt consumption at the population level based on simple predictors and applied this model to national surveys in 54 countries. We used 21 surveys with spot urine samples for the ML model derivation and validation; we developed a supervised ML regression model based on: sex, age, weight, height, systolic and diastolic blood pressure. We applied the ML model to 54 new surveys to quantify the mean salt consumption in the population. The pooled dataset in which we developed the ML model included 49,776 people. Overall, there were no substantial differences between the observed and ML-predicted mean salt intake (p<0.001). The pooled dataset where we applied the ML model included 166,677 people; the predicted mean salt consumption ranged from 6.8 g/day (95% CI: 6.8-6.8 g/day) in Eritrea to 10.0 g/day (95% CI: 9.9-10.0 g/day) in American Samoa. The countries with the highest predicted mean salt intake were in Western Pacific. The lowest predicted intake was found in Africa. The country-specific predicted mean salt intake was within reasonable difference from the best available evidence. A ML model based on readily available predictors estimated daily salt consumption with good accuracy. This model could be used to predict mean salt consumption in the general population where urine samples are not available.
|
42
|
Alexopoulos G, Zhang J, Karampelas I, Khan M, Quadri N, Patel M, Patel N, Almajali M, Mattei TA, Kemp J, Coppens J, Mercier P. Applied forecasting for delayed cerebral ischemia prediction post subarachnoid hemorrhage: Methodological fallacies. INFORMATICS IN MEDICINE UNLOCKED 2022. [DOI: 10.1016/j.imu.2021.100817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022] Open
|
43
|
Lee DY, Kim C, Lee S, Son SJ, Cho SM, Cho YH, Lim J, Park RW. Psychosis Relapse Prediction Leveraging Electronic Health Records Data and Natural Language Processing Enrichment Methods. Front Psychiatry 2022; 13:844442. [PMID: 35479497 PMCID: PMC9037331 DOI: 10.3389/fpsyt.2022.844442] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/28/2021] [Accepted: 03/09/2022] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Identifying patients at a high risk of psychosis relapse is crucial for early interventions. A relevant psychiatric clinical context is often recorded in clinical notes; however, the utilization of unstructured data remains limited. This study aimed to develop psychosis-relapse prediction models using various types of clinical notes and structured data. METHODS Clinical data were extracted from the electronic health records of the Ajou University Medical Center in South Korea. The study population included patients with psychotic disorders, and outcome was psychosis relapse within 1 year. Using only structured data, we developed an initial prediction model, then three natural language processing (NLP)-enriched models using three types of clinical notes (psychological tests, admission notes, and initial nursing assessment) and one complete model. Latent Dirichlet Allocation was used to cluster the clinical context into similar topics. All models applied the least absolute shrinkage and selection operator logistic regression algorithm. We also performed an external validation using another hospital database. RESULTS A total of 330 patients were included, and 62 (18.8%) experienced psychosis relapse. Six predictors were used in the initial model and 10 additional topics from Latent Dirichlet Allocation processing were added in the enriched models. The model derived from all notes showed the highest value of the area under the receiver operating characteristic (AUROC = 0.946) in the internal validation, followed by models based on the psychological test notes, admission notes, initial nursing assessments, and structured data only (0.902, 0.855, 0.798, and 0.784, respectively). The external validation was performed using only the initial nursing assessment note, and the AUROC was 0.616. CONCLUSIONS We developed prediction models for psychosis relapse using the NLP-enrichment method. Models using clinical notes were more effective than models using only structured data, suggesting the importance of unstructured data in psychosis prediction.
Affiliation(s)
- Dong Yun Lee
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea
- Chungsoo Kim
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, South Korea
- Seongwon Lee
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea; Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, South Korea
- Sang Joon Son
- Department of Psychiatry, Ajou University School of Medicine, Suwon, South Korea
- Sun-Mi Cho
- Department of Psychiatry, Ajou University School of Medicine, Suwon, South Korea
- Yong Hyuk Cho
- Department of Psychiatry, Ajou University School of Medicine, Suwon, South Korea
- Jaegyun Lim
- Department of Laboratory Medicine, Myongji Hospital, Hanyang University College of Medicine, Goyang, South Korea
- Rae Woong Park
- Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea; Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, South Korea
|
44
|
Data-driven methods for dengue prediction and surveillance using real-world and Big Data: A systematic review. PLoS Negl Trop Dis 2022; 16:e0010056. [PMID: 34995281 PMCID: PMC8740963 DOI: 10.1371/journal.pntd.0010056] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2021] [Accepted: 12/06/2021] [Indexed: 12/23/2022] Open
Abstract
Background Traditionally, dengue surveillance is based on case reporting to a central health agency. However, the delay between a case and its notification can limit the system responsiveness. Machine learning methods have been developed to reduce the reporting delays and to predict outbreaks, based on non-traditional and non-clinical data sources. The aim of this systematic review was to identify studies that used real-world data, Big Data and/or machine learning methods to monitor and predict dengue-related outcomes. Methodology/Principal findings We performed a search in PubMed, Scopus, Web of Science and grey literature between January 1, 2000 and August 31, 2020. The review (ID: CRD42020172472) focused on data-driven studies. Reviews, randomized control trials and descriptive studies were not included. Among the 119 studies included, 67% were published between 2016 and 2020, and 39% used at least one novel data stream. The aim of the included studies was to predict a dengue-related outcome (55%), assess the validity of data sources for dengue surveillance (23%), or both (22%). Most studies (60%) used a machine learning approach. Studies on dengue prediction compared different prediction models, or identified significant predictors among several covariates in a model. The most significant predictors were rainfall (43%), temperature (41%), and humidity (25%). The two models with the highest performances were Neural Networks and Decision Trees (52%), followed by Support Vector Machine (17%). We cannot rule out a selection bias in our study because of our two main limitations: we did not include preprints and could not obtain the opinion of other international experts. Conclusions/Significance Combining real-world data and Big Data with machine learning methods is a promising approach to improve dengue prediction and monitoring. Future studies should focus on how to better integrate all available data sources and methods to improve the response and dengue management by stakeholders.
Dengue is one of the most important arbovirus infections in the world and its public health, societal and economic burden is increasing. Although the majority of dengue cases are asymptomatic or mild, severe disease forms can lead to death. For this reason, early diagnosis and monitoring of dengue are crucial to decrease mortality. However, most endemic regions still rely on traditional monitoring methods, despite the growing availability of novel data sources and data-driven methods based on real-world data, Big Data, and machine learning algorithms. In this systematic review, we identified and analyzed studies that used these novel approaches for dengue monitoring and/or prediction. We found that novel data streams, such as Internet search engines and social media platforms, and machine learning methods can be successfully used to improve dengue management, but are still vastly ignored in real life. These approaches should be combined with traditional methods to help stakeholders better prepare for each outbreak and improve early responsiveness.
|
45
|
AI and Immunoinformatics. Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
|
46
|
Nwanosike EM, Conway BR, Merchant HA, Hasan SS. Potential applications and performance of machine learning techniques and algorithms in clinical practice: A systematic review. Int J Med Inform 2021; 159:104679. [PMID: 34990939 DOI: 10.1016/j.ijmedinf.2021.104679] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2021] [Revised: 12/08/2021] [Accepted: 12/27/2021] [Indexed: 12/11/2022]
Abstract
PURPOSE The advent of clinically adapted machine learning algorithms can solve numerous problems ranging from disease diagnosis and prognosis to therapy recommendations. This systematic review examines the performance of machine learning (ML) algorithms and evaluates the progress made to date towards their implementation in clinical practice. METHODS Databases (PubMed, MEDLINE, Scopus, Google Scholar, Cochrane Library and WHO Covid-19 database) were systematically searched to identify original articles published between January 2011 and October 2021. Studies reporting ML techniques in clinical practice involving humans and ML algorithms with a performance metric were considered. RESULTS Of 873 unique articles identified, 36 studies were eligible for inclusion. The XGBoost (extreme gradient boosting) algorithm showed the highest potential for clinical applications (n = 7 studies); this was followed jointly by the random forest algorithm, logistic regression, and the support vector machine (n = 5 studies each). Prediction of outcomes (n = 33), in particular inflammatory diseases (n = 7), received the most attention, followed by cancer and neuropsychiatric disorders (n = 5 for each) and Covid-19 (n = 4). Thirty-three out of the thirty-six included studies passed more than 50% of the selected quality assessment criteria in the TRIPOD checklist. In contrast, none of the studies could achieve an ideal overall bias rating of 'low' based on the PROBAST checklist. Furthermore, only three studies showed evidence of the deployment of ML algorithm(s) in clinical practice. CONCLUSIONS ML is potentially a reliable tool for clinical decision support. Although advocated widely in clinical practice, work is still in progress to validate clinically adapted ML algorithms. Improving quality standards, transparency, and interpretability of ML models will further lower the barriers to acceptability.
Affiliation(s)
- Ezekwesiri Michael Nwanosike
- Department of Pharmacy, School of Applied Sciences, University of Huddersfield, Queensgate Huddersfield HD1 3DH, West Yorkshire, United Kingdom
- Barbara R Conway
- Department of Pharmacy, School of Applied Sciences, University of Huddersfield, Queensgate Huddersfield HD1 3DH, West Yorkshire, United Kingdom
- Hamid A Merchant
- Department of Pharmacy, School of Applied Sciences, University of Huddersfield, Queensgate Huddersfield HD1 3DH, West Yorkshire, United Kingdom
- Syed Shahzad Hasan
- Department of Pharmacy, School of Applied Sciences, University of Huddersfield, Queensgate Huddersfield HD1 3DH, West Yorkshire, United Kingdom; School of Biomedical Sciences & Pharmacy, University of Newcastle, Callaghan, Australia.
|
47
|
Kennedy EE, Bowles KH, Aryal S. Systematic review of prediction models for postacute care destination decision-making. J Am Med Inform Assoc 2021; 29:176-186. [PMID: 34757383 PMCID: PMC8714284 DOI: 10.1093/jamia/ocab197] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2021] [Revised: 07/21/2021] [Accepted: 09/01/2021] [Indexed: 01/12/2023] Open
Abstract
OBJECTIVE This article reports a systematic review of studies containing development and validation of models predicting postacute care destination after adult inpatient hospitalization, summarizes clinical populations and variables, evaluates model performance, assesses risk of bias and applicability, and makes recommendations to reduce bias in future models. MATERIALS AND METHODS A systematic literature review was conducted following PRISMA guidelines and the Cochrane Prognosis Methods Group criteria. Online databases were searched in June 2020 to identify all published studies in this area. Data were extracted based on the CHARMS checklist, and studies were evaluated based on predictor variables, validation, performance in validation, risk of bias, and applicability using the Prediction Model Risk of Bias Assessment Tool (PROBAST). RESULTS The final sample contained 28 articles with 35 models for evaluation. Models focused on surgical (22), medical (5), or both (8) populations. Eighteen models were internally validated, 10 were externally validated, and 7 models underwent both types. Model performance varied within and across populations. Most models used retrospective data, the median number of predictors was 8.5, and most models demonstrated risk of bias. DISCUSSION AND CONCLUSION Prediction modeling studies for postacute care destinations are becoming more prolific in the literature, but model development and validation strategies are inconsistent, and performance is variable. Most models are developed using regression, but machine learning methods are increasing in frequency. Future studies should ensure rigorous variable selection and follow TRIPOD guidelines. Only 14% of the models have been tested or implemented beyond original studies, so translation into practice requires further investigation.
Affiliation(s)
- Erin E Kennedy
- NewCourtland Center for Transitions and Health, University of Pennsylvania School of Nursing, Philadelphia, Pennsylvania, USA
- Leonard Davis Institute for Health Economics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Kathryn H Bowles
- NewCourtland Center for Transitions and Health, University of Pennsylvania School of Nursing, Philadelphia, Pennsylvania, USA
- Leonard Davis Institute for Health Economics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Subhash Aryal
- Biostatistics, Evaluation, Collaboration, Consultation, and Analysis Lab, University of Pennsylvania School of Nursing, Philadelphia, Pennsylvania, USA
- Department of Family and Community Health, University of Pennsylvania School of Nursing, Philadelphia, Pennsylvania, USA
|
48
|
Sánchez-Salmerón R, Gómez-Urquiza JL, Albendín-García L, Correa-Rodríguez M, Martos-Cabrera MB, Velando-Soriano A, Suleiman-Martos N. Machine learning methods applied to triage in emergency services: A systematic review. Int Emerg Nurs 2021; 60:101109. [PMID: 34952482 DOI: 10.1016/j.ienj.2021.101109] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2021] [Revised: 08/23/2021] [Accepted: 10/22/2021] [Indexed: 12/23/2022]
Abstract
BACKGROUND In emergency services, it is important to accurately assess and classify symptoms, which may be improved with the help of technology. One mechanism that could help and improve predictions from health records or patient flow is machine learning (ML). AIM To analyse the effectiveness of ML systems in triage for making predictions at the emergency department in comparison with other triage scales/scores. METHODS Following the PRISMA recommendations, a systematic review was conducted using CINAHL, Cochrane, Cuiden, Medline and Scopus databases with the search equation "Machine learning AND triage AND emergency". RESULTS Eleven studies were identified. The studies show that ML methods consistently predict important outcomes like mortality, critical care outcomes and admission, and the need for hospitalization in comparison with scales like the Emergency Severity Index or others. Among the ML models considered, XGBoost and Deep Neural Networks obtained the highest levels of prediction accuracy, while Logistic Regression obtained the worst values. CONCLUSIONS Machine learning methods can be a good instrument for helping the triage process with the prediction of important emergency variables like mortality or the need for critical care or hospitalization.
Affiliation(s)
- José L Gómez-Urquiza
- Faculty of Health Sciences, University of Granada, Avenida de la Ilustración N. 60, 18016 Granada, Spain.
- Luis Albendín-García
- Faculty of Health Sciences, University of Granada, Avenida de la Ilustración N. 60, 18016 Granada, Spain.
- María Correa-Rodríguez
- Faculty of Health Sciences, University of Granada, Avenida de la Ilustración N. 60, 18016 Granada, Spain.
- María Begoña Martos-Cabrera
- San Cecilio Clinical University Hospital, Andalusian Health Service, Avenida del Conocimiento s/n, 18016 Granada, Spain.
- Almudena Velando-Soriano
- San Cecilio Clinical University Hospital, Andalusian Health Service, Avenida del Conocimiento s/n, 18016 Granada, Spain.
- Nora Suleiman-Martos
- Faculty of Health Sciences, Ceuta University Campus, University of Granada, C/Cortadura del Valle SN, 51001 Ceuta, Spain.
|
49
|
Lim MJR, Quek RHC, Ng KJ, Loh NHW, Lwin S, Teo K, Nga VDW, Yeo TT, Motani M. Machine Learning Models Prognosticate Functional Outcomes Better than Clinical Scores in Spontaneous Intracerebral Haemorrhage. J Stroke Cerebrovasc Dis 2021; 31:106234. [PMID: 34896819 DOI: 10.1016/j.jstrokecerebrovasdis.2021.106234] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2021] [Revised: 11/11/2021] [Accepted: 11/17/2021] [Indexed: 10/19/2022] Open
Abstract
OBJECTIVE This study aims to develop and compare the use of deep neural networks (DNN) and support vector machines (SVM) to clinical prognostic scores for prognosticating 30-day mortality and 90-day poor functional outcome (PFO) in spontaneous intracerebral haemorrhage (SICH). MATERIALS AND METHODS We conducted a retrospective cohort study of 297 SICH patients between December 2014 and May 2016. Clinical data was collected from electronic medical records using standardized data collection forms. The machine learning workflow included imputation of missing data, dimensionality reduction, imbalanced-class correction, and evaluation using cross-validation and comparison of accuracy against clinical prognostic scores. RESULTS 32 (11%) patients had 30-day mortality while 177 (63%) patients had 90-day PFO. For prognosticating 30-day mortality, the class-balanced accuracies for DNN (0.875; 95% CI 0.800-0.950; McNemar's p-value 1.000) and SVM (0.848; 95% CI 0.767-0.930; McNemar's p-value 0.791) were comparable to that of the original ICH score (0.833; 95% CI 0.748-0.918). The c-statistics for DNN (0.895; DeLong's p-value 0.715), and SVM (0.900; DeLong's p-value 0.619), though greater than that of the original ICH score (0.862), were not significantly different. For prognosticating 90-day PFO, the class-balanced accuracies for DNN (0.853; 95% CI 0.772-0.934; McNemar's p-value 0.003) and SVM (0.860; 95% CI 0.781-0.939; McNemar's p-value 0.004) were better than that of the ICH-Grading Scale (0.706; 95% CI 0.600-0.812). The c-statistic for SVM (0.883; DeLong's p-value 0.022) was significantly greater than that of the ICH-Grading Scale (0.778), while the c-statistic for DNN was 0.864 (DeLong's p-value 0.055). CONCLUSION We showed that the SVM model performs significantly better than clinical prognostic scores in predicting 90-day PFO in SICH.
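Two of the quantities reported in this abstract can be made concrete with a short Python sketch: class-balanced accuracy (the mean of sensitivity and specificity) and a McNemar test on paired predictions. The abstract does not state which variant of McNemar's test was used, so the exact binomial form below is an assumption, and the function names and counts are hypothetical illustrations, not study data.

```python
from math import comb


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: cases model A classified correctly and model B incorrectly.
    c: cases model B classified correctly and model A incorrectly.
    Under the null hypothesis the discordant pairs follow Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts for illustration only.
print(balanced_accuracy(tp=8, fn=2, tn=45, fp=45))  # sensitivity 0.8, specificity 0.5
print(mcnemar_exact(b=1, c=9))
```

Balanced accuracy is the natural headline metric when one outcome is rare, as with the 11% 30-day mortality reported here, because plain accuracy would reward a model that always predicts survival.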
Affiliation(s)
- Mervyn Jun Rui Lim
  - Division of Neurosurgery, University Surgical Centre, National University Hospital, Singapore
- Kai Jie Ng
  - Yong Loo Lin School of Medicine, National University of Singapore
- Ne-Hooi Will Loh
  - Department of Anaesthesia, National University Hospital, Singapore
- Sein Lwin
  - Division of Neurosurgery, University Surgical Centre, National University Hospital, Singapore
- Kejia Teo
  - Division of Neurosurgery, University Surgical Centre, National University Hospital, Singapore
- Vincent Diong Weng Nga
  - Division of Neurosurgery, University Surgical Centre, National University Hospital, Singapore
- Tseng Tsai Yeo
  - Division of Neurosurgery, University Surgical Centre, National University Hospital, Singapore
- Mehul Motani
  - Department of Electrical and Computer Engineering, National University of Singapore; N.1 Institute for Health, National University of Singapore; Institute for Data Science, National University of Singapore
50
Rahim AIA, Ibrahim MI, Chua SL, Musa KI. Hospital Facebook Reviews Analysis Using a Machine Learning Sentiment Analyzer and Quality Classifier. Healthcare (Basel) 2021; 9:1679. [PMID: 34946405 PMCID: PMC8701188 DOI: 10.3390/healthcare9121679] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/03/2021] [Revised: 11/30/2021] [Accepted: 12/02/2021] [Indexed: 02/05/2023]
Abstract
While experts have recognised the significance and necessity of social media integration in healthcare, no systematic method has been devised in Malaysia or Southeast Asia to incorporate social media input into the hospital quality improvement process. The goal of this work is to explain how to develop a machine learning system for classifying Facebook reviews of public hospitals in Malaysia using service quality (SERVQUAL) dimensions and sentiment analysis. We developed a Machine Learning Quality Classifier (MLQC) based on the SERVQUAL model and a Machine Learning Sentiment Analyzer (MLSA) by manually annotating multiple batches of randomly chosen reviews. Logistic regression (LR), naive Bayes (NB), support vector machine (SVM), and other methods were used to train the classifiers. The performance of each classifier was tested using 5-fold cross validation. For topic classification, the average F1-score was between 0.687 and 0.757 for all models. In a 5-fold cross validation of each SERVQUAL dimension and in sentiment analysis, SVM consistently outperformed other methods. The study demonstrates how to use supervised learning to automatically identify SERVQUAL domains and sentiments from patient experiences on a hospital's Facebook page. Malaysian healthcare providers can use this content analysis technology to gather and assess data on patient care and improve hospital quality of care.
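To make the evaluation procedure described in the abstract concrete, the following is a minimal sketch of 5-fold cross-validation with an averaged F1-score for a text classifier. It is not the authors' pipeline: the tiny naive Bayes model, the interleaved fold assignment, the function names, and the example reviews are all hypothetical stand-ins chosen so the block runs on the standard library alone.

```python
from collections import Counter, defaultdict
from math import log


def f1_score(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n, k):
    """Yield (train, test) index lists; fold i holds out items i, i+k, i+2k, ..."""
    for i in range(k):
        test = list(range(i, n, k))
        held_out = set(test)
        yield [j for j in range(n) if j not in held_out], test


def fit_naive_bayes(texts, labels):
    """Multinomial naive Bayes with add-one smoothing; returns a predict function."""
    word_counts, class_counts, vocab = defaultdict(Counter), Counter(labels), set()
    for text, label in zip(texts, labels):
        words = text.lower().split()
        word_counts[label].update(words)
        vocab.update(words)

    def predict(text):
        best, best_score = None, float("-inf")
        for label, n_docs in class_counts.items():
            total = sum(word_counts[label].values())
            score = log(n_docs / len(labels))  # log prior
            for w in text.lower().split():
                score += log((word_counts[label][w] + 1) / (total + len(vocab)))
            if score > best_score:
                best, best_score = label, score
        return best

    return predict


def cross_validated_f1(texts, labels, positive, k=5):
    """Average the held-out F1-score over k folds."""
    scores = []
    for train, test in k_fold_indices(len(texts), k):
        predict = fit_naive_bayes([texts[i] for i in train], [labels[i] for i in train])
        preds = [predict(texts[i]) for i in test]
        scores.append(f1_score([labels[i] for i in test], preds, positive))
    return sum(scores) / len(scores)


# Hypothetical reviews, alternating positive and negative sentiment.
reviews = ["staff friendly and helpful", "long wait and rude staff",
           "doctor kind and caring", "rude nurses and slow service",
           "nurses friendly and caring", "slow service long wait",
           "service fast and helpful", "dirty ward and rude doctor",
           "doctor helpful and kind", "long wait slow and dirty"]
sentiments = ["pos", "neg"] * 5

print(cross_validated_f1(reviews, sentiments, positive="pos", k=5))
```

Any classifier exposing the same fit-then-predict shape could be swapped in for `fit_naive_bayes`, which is how several methods can be compared on identical folds.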
Affiliation(s)
- Afiq Izzudin A. Rahim
  - Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
- Mohd Ismail Ibrahim
  - Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia
- Sook-Ling Chua
  - Faculty of Computing and Informatics, Multimedia University, Persiaran Multimedia, Cyberjaya 63100, Selangor, Malaysia
- Kamarul Imran Musa
  - Department of Community Medicine, School of Medical Science, Universiti Sains Malaysia, Kubang Kerian, Kota Bharu 16150, Kelantan, Malaysia