1
Heseltine-Carp W, Courtman M, Browning D, Kasabe A, Allen M, Streeter A, Ifeachor E, James M, Mullin S. Machine learning to predict stroke risk from routine hospital data: A systematic review. Int J Med Inform 2025; 196:105811. [PMID: 39908727 DOI: 10.1016/j.ijmedinf.2025.105811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2024] [Revised: 01/20/2025] [Accepted: 01/23/2025] [Indexed: 02/07/2025]
Abstract
PURPOSE Stroke remains a leading cause of morbidity and mortality. Despite this, current risk stratification tools such as CHA2DS2-VASc and QRISK3 are of limited accuracy, particularly in those without a diagnosis of atrial fibrillation (AF). Hence, there is a need for more accurate stroke risk prediction models. Machine learning (ML) may provide a solution by leveraging existing routine hospital databases to build accurate stroke risk prediction models and identify novel risk factors for stroke. AIMS In this systematic review we appraise current research using ML to predict stroke risk from routine hospital data. Based on these findings we then highlight common methodological limitations and make recommendations for future research. METHODS We identified 49 original research articles (38 in the general population and 11 in AF-specific populations) from the PubMed database, published from January 2013 to December 2024, that used ML and routine hospital data to predict the risk of stroke. RESULTS ML models were able to accurately predict stroke risk in both AF-specific and general populations, with AUCs ranging from 0.64 to 0.99. Where tested, ML also consistently outperformed traditional risk stratification tools, such as CHA2DS2-VASc. ML also appeared useful in identifying several novel risk factors from electrocardiogram, laboratory test and echocardiography data. However, the quality of datasets was often limited, there was a high suspicion of overfitting, and models often lacked calibration, external validation and explainability analysis. CONCLUSION Whilst ML has shown great potential in stroke prediction and in identifying novel risk factors for stroke, improvements in study methodology are required prior to integration of ML into routine healthcare. Future research should adhere to the EQUATOR guidance on prediction models and encourage interdisciplinary collaboration between computer scientists and clinicians.
Further prospective RCTs are also required to validate models in the clinical setting and to identify barriers to integrating ML into routine healthcare.
Affiliation(s)
- William Heseltine-Carp
  - University of Plymouth, Room N6, ITTC Building, Plymouth Science Park, Plymouth PL6 8BX, UK.
- Megan Courtman
  - University of Plymouth, Room N6, ITTC Building, Plymouth Science Park, Plymouth PL6 8BX, UK; University of Plymouth, Plymouth PL4 8AA, UK.
- Daniel Browning
  - University of Plymouth, Room N6, ITTC Building, Plymouth Science Park, Plymouth PL6 8BX, UK.
- Aishwarya Kasabe
  - University of Plymouth, Room N6, ITTC Building, Plymouth Science Park, Plymouth PL6 8BX, UK.
- Michael Allen
  - University of Exeter, Medical School, St Lukes Campus, Heavitree Road, SC 2.30, Exeter EX4 4QJ, UK.
- Adam Streeter
  - University of Plymouth, N15, ITTC1, Plymouth Science Park, Plymouth PL6 8BX, UK.
- Emmanuel Ifeachor
  - University of Plymouth, N15, ITTC1, Plymouth Science Park, Plymouth PL6 8BX, UK; School of Engineering, Computing and Mathematics, University of Plymouth, Plymouth PL4 8AA, UK.
- Martin James
  - University of Exeter, Academic Department of Healthcare for Older People, Royal Devon & Exeter Hospital, Exeter EX2 5DW, UK.
- Stephen Mullin
  - University of Plymouth, Room N6, ITTC Building, Plymouth Science Park, Plymouth PL6 8BX, UK.
2
Miceli G, Basso MG, Rizzo G, Pintus C, Cocciola E, Pennacchio AR, Tuttolomondo A. Artificial Intelligence in Acute Ischemic Stroke Subtypes According to Toast Classification: A Comprehensive Narrative Review. Biomedicines 2023; 11:1138. [PMID: 37189756 PMCID: PMC10135701 DOI: 10.3390/biomedicines11041138] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 02/27/2023] [Revised: 03/29/2023] [Accepted: 04/06/2023] [Indexed: 05/17/2023] Open
Abstract
Correctly recognizing the etiology of ischemic stroke (IS) allows timely therapeutic interventions aimed at treating the cause and preventing a new cerebral ischemic event. Nevertheless, identifying the cause is often challenging and relies on clinical features and on data obtained by imaging techniques and other diagnostic exams. The TOAST classification system describes the different etiologies of ischemic stroke and includes five subtypes: LAAS (large-artery atherosclerosis), CEI (cardioembolism), SVD (small vessel disease), ODE (stroke of other determined etiology), and UDE (stroke of undetermined etiology). AI models, which provide computational methodologies for quantitative and objective evaluations, seem to increase the sensitivity with which the main causes of IS are detected, for example in the tomographic diagnosis of carotid stenosis, the electrocardiographic recognition of atrial fibrillation, and the identification of small vessel disease in magnetic resonance images. The aim of this review is to provide overall knowledge about the most effective AI models used in the differential diagnosis of ischemic stroke etiology according to the TOAST classification. According to our results, AI has proven to be a useful tool for identifying predictive factors capable of subtyping acute stroke patients in large heterogeneous populations and, in particular, for clarifying the etiology of UDE IS, especially by detecting cardioembolic sources.
Affiliation(s)
- Giuseppe Miceli
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties (ProMISE), Università Degli Studi di Palermo, Piazza Delle Cliniche 2, 90127 Palermo, Italy
  - Internal Medicine and Stroke Care Ward, University Hospital, Policlinico “P. Giaccone”, 90141 Palermo, Italy
- Maria Grazia Basso
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties (ProMISE), Università Degli Studi di Palermo, Piazza Delle Cliniche 2, 90127 Palermo, Italy
  - Internal Medicine and Stroke Care Ward, University Hospital, Policlinico “P. Giaccone”, 90141 Palermo, Italy
- Giuliana Rizzo
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties (ProMISE), Università Degli Studi di Palermo, Piazza Delle Cliniche 2, 90127 Palermo, Italy
  - Internal Medicine and Stroke Care Ward, University Hospital, Policlinico “P. Giaccone”, 90141 Palermo, Italy
- Chiara Pintus
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties (ProMISE), Università Degli Studi di Palermo, Piazza Delle Cliniche 2, 90127 Palermo, Italy
  - Internal Medicine and Stroke Care Ward, University Hospital, Policlinico “P. Giaccone”, 90141 Palermo, Italy
- Elena Cocciola
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties (ProMISE), Università Degli Studi di Palermo, Piazza Delle Cliniche 2, 90127 Palermo, Italy
  - Internal Medicine and Stroke Care Ward, University Hospital, Policlinico “P. Giaccone”, 90141 Palermo, Italy
- Andrea Roberta Pennacchio
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties (ProMISE), Università Degli Studi di Palermo, Piazza Delle Cliniche 2, 90127 Palermo, Italy
  - Internal Medicine and Stroke Care Ward, University Hospital, Policlinico “P. Giaccone”, 90141 Palermo, Italy
- Antonino Tuttolomondo
  - Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties (ProMISE), Università Degli Studi di Palermo, Piazza Delle Cliniche 2, 90127 Palermo, Italy
  - Internal Medicine and Stroke Care Ward, University Hospital, Policlinico “P. Giaccone”, 90141 Palermo, Italy
3
Qiu Y, Cheng S, Wu Y, Yan W, Hu S, Chen Y, Xu Y, Chen X, Yang J, Chen X, Zheng H. Development of rapid and effective risk prediction models for stroke in the Chinese population: a cross-sectional study. BMJ Open 2023; 13:e068045. [PMID: 36858471 PMCID: PMC9980356 DOI: 10.1136/bmjopen-2022-068045] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 03/03/2023] Open
Abstract
OBJECTIVES The purpose of this study was to use easily obtained and directly observable clinical features to establish predictive models to identify patients at increased risk of stroke. SETTING AND PARTICIPANTS A total of 46 240 valid records were obtained from 8 research centres and 14 communities in Jiangxi province, China, between February and September 2018. PRIMARY AND SECONDARY OUTCOME MEASURES The area under the receiver operating characteristic curve (AUC), sensitivity, specificity and accuracy were calculated to test the performance of the five models (logistic regression (LR), random forest (RF), decision tree (DT), extreme gradient boosting (XGBoost) and gradient boosting DT). The calibration curve was used to show calibration performance. RESULTS The results indicated that XGBoost (AUC: 0.924, accuracy: 0.873, sensitivity: 0.776, specificity: 0.916) and RF (AUC: 0.924, accuracy: 0.872, sensitivity: 0.778, specificity: 0.913) demonstrated excellent performance in predicting stroke. Physical inactivity, hypertension, meat-based diet and high salt intake were important prediction features of stroke. CONCLUSION The five machine learning models all had good predictive and discriminatory performance for stroke. The performance of RF and XGBoost was slightly better than that of LR, which was easier to interpret and less prone to overfitting. This work provides a rapid and accurate tool for stroke risk assessment, which can help to improve the efficiency of stroke screening medical services and the management of high-risk groups.
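The sensitivity, specificity and accuracy reported in this abstract are all ratios of confusion-matrix counts. The following is a minimal sketch of those definitions; the function name `classification_metrics` and the counts passed to it are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true stroke cases that were flagged
    specificity = tn / (tn + fp)  # share of non-stroke cases that were cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # share of all cases classified correctly
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Hypothetical counts for illustration only (not data from the study).
metrics = classification_metrics(tp=80, fp=20, tn=180, fn=20)
print(metrics)
```

Because accuracy mixes the two classes in proportion to their frequency, a model can show high accuracy and specificity alongside a noticeably lower sensitivity when stroke cases are the minority class.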
Affiliation(s)
- Yuexin Qiu
  - School of Public Health, Nanchang University, Nanchang, Jiangxi, China
  - Key Laboratory of Preventive Medicine, Nanchang University, Nanchang, Jiangxi, China
- Shiqi Cheng
  - Neurosurgery Department, Nanchang University Second Affiliated Hospital, Nanchang, Jiangxi, China
- Yuhang Wu
  - Department of Epidemiology and Health Statistics, Central South University, Changsha, Hunan, China
- Wei Yan
  - Institute of Chronic Non-communicable Diseases, Center for Disease Control and Prevention of Jiangxi Province, Nanchang, Jiangxi, China
- Songbo Hu
  - School of Public Health, Nanchang University, Nanchang, Jiangxi, China
  - Key Laboratory of Preventive Medicine, Nanchang University, Nanchang, Jiangxi, China
- Yiying Chen
  - Institute of Chronic Non-communicable Diseases, Center for Disease Control and Prevention of Jiangxi Province, Nanchang, Jiangxi, China
- Yan Xu
  - Institute of Chronic Non-communicable Diseases, Center for Disease Control and Prevention of Jiangxi Province, Nanchang, Jiangxi, China
- Xiaona Chen
  - Institute of Chronic Non-communicable Diseases, Center for Disease Control and Prevention of Jiangxi Province, Nanchang, Jiangxi, China
- Junsai Yang
  - School of Public Health, Nanchang University, Nanchang, Jiangxi, China
  - Key Laboratory of Preventive Medicine, Nanchang University, Nanchang, Jiangxi, China
- Xiaoyun Chen
  - School of Public Health, Nanchang University, Nanchang, Jiangxi, China
  - Key Laboratory of Preventive Medicine, Nanchang University, Nanchang, Jiangxi, China
- Huilie Zheng
  - School of Public Health, Nanchang University, Nanchang, Jiangxi, China
  - Key Laboratory of Preventive Medicine, Nanchang University, Nanchang, Jiangxi, China
4
Chen M, Tan X, Padman R. A Machine Learning Approach to Support Urgent Stroke Triage Using Administrative Data and Social Determinants of Health at Hospital Presentation: Retrospective Study. J Med Internet Res 2023; 25:e36477. [PMID: 36716097 PMCID: PMC9926350 DOI: 10.2196/36477] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/14/2022] [Revised: 07/17/2022] [Accepted: 12/18/2022] [Indexed: 01/31/2023] Open
Abstract
BACKGROUND The key to effective stroke management is timely diagnosis and triage. Machine learning (ML) methods developed to assist in detecting stroke have focused on interpreting detailed clinical data such as clinical notes and diagnostic imaging results. However, such information may not be readily available when patients are initially triaged, particularly in rural and underserved communities. OBJECTIVE This study aimed to develop an ML stroke prediction algorithm based on data widely available at the time of patients' hospital presentations and assess the added value of social determinants of health (SDoH) in stroke prediction. METHODS We conducted a retrospective study of the emergency department and hospitalization records from 2012 to 2014 from all the acute care hospitals in the state of Florida, merged with the SDoH data from the American Community Survey. A case-control design was adopted to construct stroke and stroke mimic cohorts. We compared the algorithm performance and feature importance measures of the ML models (ie, gradient boosting machine and random forest) with those of the logistic regression model based on 3 sets of predictors. To provide insights into the prediction and ultimately assist care providers in decision-making, we used TreeSHAP for tree-based ML models to explain the stroke prediction. RESULTS Our analysis included 143,203 hospital visits of unique patients, and it was confirmed based on the principal diagnosis at discharge that 73% (n=104,662) of these patients had a stroke. The approach proposed in this study has high sensitivity and is particularly effective at reducing the misdiagnosis of dangerous stroke chameleons (false-negative rate <4%). ML classifiers consistently outperformed the benchmark logistic regression in all 3 input combinations. We found significant consistency across the models in the features that explain their performance. 
The most important features are age, the number of chronic conditions on admission, and primary payer (eg, Medicare or private insurance). Although both the individual- and community-level SDoH features helped improve the predictive performance of the models, the inclusion of the individual-level SDoH features led to a much larger improvement (area under the receiver operating characteristic curve increased from 0.694 to 0.823) than the inclusion of the community-level SDoH features (area under the receiver operating characteristic curve increased from 0.823 to 0.829). CONCLUSIONS Using data widely available at the time of patients' hospital presentations, we developed a stroke prediction model with high sensitivity and reasonable specificity. The prediction algorithm uses variables that are routinely collected by providers and payers and might be useful in underresourced hospitals with limited availability of sensitive diagnostic tools or incomplete data-gathering capabilities.
Affiliation(s)
- Min Chen
  - Department of Information Systems & Business Analytics, College of Business, Florida International University, Miami, FL, United States
- Xuan Tan
  - Department of Information Systems and Analytics, Leavey School of Business, Santa Clara University, Santa Clara, CA, United States
- Rema Padman
  - The H John Heinz III College of Information Systems and Public Policy, Carnegie Mellon University, Pittsburgh, PA, United States
5
Akyel A. Accurate estimation of stroke risk with fuzzy clustering and ensemble learning methods. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
6
Artificial Intelligence: A Shifting Paradigm in Cardio-Cerebrovascular Medicine. J Clin Med 2021; 10:jcm10235710. [PMID: 34884412 PMCID: PMC8658222 DOI: 10.3390/jcm10235710] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/15/2021] [Accepted: 12/02/2021] [Indexed: 12/21/2022] Open
Abstract
The future of healthcare is an organic blend of technology, innovation, and human connection. As artificial intelligence (AI) gradually becomes a go-to technology in healthcare for improving efficiency and outcomes, we must understand our limitations. We should realize that our goal is not only to provide faster and more efficient care, but also to deliver an integrated solution that ensures care is fair and not biased against any subpopulation. In this context, the field of cardio-cerebrovascular diseases, which encompasses a wide range of conditions from heart failure to stroke, has made some advances in providing assistive tools to care providers. This article aimed to provide an overall thematic review of recent developments in AI applications in cardio-cerebrovascular diseases in order to identify gaps and potential areas of improvement. If well designed, technological engines have the potential to improve healthcare access and equitability while reducing overall costs, diagnostic errors, and disparity in a system that affects patients and providers and strives for efficiency.
7
Zanotto BS, Beck da Silva Etges AP, Dal Bosco A, Cortes EG, Ruschel R, De Souza AC, Andrade CMV, Viegas F, Canuto S, Luiz W, Ouriques Martins S, Vieira R, Polanczyk C, André Gonçalves M. Stroke Outcome Measurements From Electronic Medical Records: Cross-sectional Study on the Effectiveness of Neural and Nonneural Classifiers. JMIR Med Inform 2021; 9:e29120. [PMID: 34723829 PMCID: PMC8593798 DOI: 10.2196/29120] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/29/2021] [Revised: 06/27/2021] [Accepted: 08/05/2021] [Indexed: 01/20/2023] Open
Abstract
BACKGROUND With the rapid adoption of electronic medical records (EMRs), there is an ever-increasing opportunity to collect data and extract knowledge from EMRs to support patient-centered stroke management. OBJECTIVE This study aims to compare the effectiveness of state-of-the-art automatic text classification methods in classifying data to support the prediction of clinical patient outcomes and the extraction of patient characteristics from EMRs. METHODS Our study addressed the computational problems of information extraction and automatic text classification. We identified essential tasks to be considered in an ischemic stroke value-based program. The 30 selected tasks were classified (manually labeled by specialists) according to the following value agenda: tier 1 (achieved health care status), tier 2 (recovery process), care related (clinical management and risk scores), and baseline characteristics. The analyzed data set was retrospectively extracted from the EMRs of patients with stroke from a private Brazilian hospital between 2018 and 2019. A total of 44,206 sentences from free-text medical records in Portuguese were used to train and develop 10 supervised computational machine learning methods, including state-of-the-art neural and nonneural methods, along with ontological rules. As an experimental protocol, we used a 5-fold cross-validation procedure repeated 6 times, along with subject-wise sampling. A heatmap was used to display comparative result analyses according to the best algorithmic effectiveness (F1 score), supported by statistical significance tests. A feature importance analysis was conducted to provide insights into the results. RESULTS The top-performing models were support vector machines trained with lexical and semantic textual features, showing the importance of dealing with noise in EMR textual representations. 
The support vector machine models produced statistically superior results in 71% (17/24) of tasks, with an F1 score >80% regarding care-related tasks (patient treatment location, fall risk, thrombolytic therapy, and pressure ulcer risk), the process of recovery (ability to feed orally or ambulate and communicate), health care status achieved (mortality), and baseline characteristics (diabetes, obesity, dyslipidemia, and smoking status). Neural methods were largely outperformed by more traditional nonneural methods, given the characteristics of the data set. Ontological rules were also effective in tasks such as baseline characteristics (alcoholism, atrial fibrillation, and coronary artery disease) and the Rankin scale. The complementarity in effectiveness among models suggests that a combination of models could enhance the results and cover more tasks in the future. CONCLUSIONS Advances in information technology capacity are essential for scalability and agility in measuring health status outcomes. This study allowed us to measure effectiveness and identify opportunities for automating the classification of outcomes of specific tasks related to clinical conditions of stroke victims, and thus ultimately assess the possibility of proactively using these machine learning techniques in real-world situations.
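The experimental protocol in this abstract, 5-fold cross-validation repeated 6 times with subject-wise sampling, can be sketched as follows. This is an illustrative sketch only, assuming that "subject-wise" means all sentences from one patient stay in the same fold; the function names and the toy patient identifiers are hypothetical and are not the authors' code.

```python
import random
from collections import defaultdict


def subject_wise_folds(subject_ids, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; all rows of a subject share one fold."""
    by_subject = defaultdict(list)
    for row, subject in enumerate(subject_ids):
        by_subject[subject].append(row)
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    # Deal the shuffled subjects round-robin into k folds.
    folds = [subjects[i::k] for i in range(k)]
    for held_out in folds:
        held = set(held_out)
        test_idx = [r for s in held_out for r in by_subject[s]]
        train_idx = [r for s in subjects if s not in held for r in by_subject[s]]
        yield train_idx, test_idx


def repeated_subject_wise_cv(subject_ids, k=5, repeats=6):
    """Repeat the k-fold split with a different shuffle for each repetition."""
    for rep in range(repeats):
        splits = subject_wise_folds(subject_ids, k, seed=rep)
        for fold, (train_idx, test_idx) in enumerate(splits):
            yield rep, fold, train_idx, test_idx


# Toy data: 10 hypothetical patients with 3 sentences each.
ids = ["patient%d" % (i // 3) for i in range(30)]
splits = list(repeated_subject_wise_cv(ids))
print(len(splits))
```

Splitting by patient rather than by sentence keeps text from one patient out of both the training and test sets, which would otherwise tend to inflate the measured F1 score.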
Affiliation(s)
- Bruna Stella Zanotto
  - National Institute of Health Technology Assessment - INCT/IATS (CNPQ 465518/2014-1), Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil; Graduate Program in Epidemiology, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Ana Paula Beck da Silva Etges
  - National Institute of Health Technology Assessment - INCT/IATS (CNPQ 465518/2014-1), Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil; School of Technology, Pontifícia Universidade Católica do Rio Grande do Sul, Porto Alegre, Brazil
- Avner Dal Bosco
  - School of Technology, Pontifícia Universidade Católica do Rio Grande do Sul, Porto Alegre, Brazil
- Eduardo Gabriel Cortes
  - Graduate Program of Computer Science, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Renata Ruschel
  - National Institute of Health Technology Assessment - INCT/IATS (CNPQ 465518/2014-1), Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Claudio M V Andrade
  - Computer Science Department, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Felipe Viegas
  - Computer Science Department, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Sergio Canuto
  - Computer Science Department, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Washington Luiz
  - Computer Science Department, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Renata Vieira
  - Centro Interdisciplinar de História, Culturas e Sociedades (CIDEHUS), Universidade de Évora, Évora, Portugal
- Carisi Polanczyk
  - National Institute of Health Technology Assessment - INCT/IATS (CNPQ 465518/2014-1), Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil; Graduate Program in Epidemiology, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Marcos André Gonçalves
  - Computer Science Department, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
8
A Digital Twins Machine Learning Model for Forecasting Disease Progression in Stroke Patients. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11125576] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/16/2022]
Abstract
Background: Machine learning methods have been developed to predict the likelihood of a given event or classify patients into two or more diagnostic categories. Digital twin models, which forecast entire trajectories of patient health data, have potential applications in clinical trials and patient management. Methods: In this study, we apply a digital twin model based on a variational autoencoder to a population of patients who went on to experience an ischemic stroke. Using International Classification of Diseases (ICD) codes for ischemic stroke and lab values as inputs, we assessed the digital twin's ability to forecast clinical measurement trajectories leading up to the onset of the acute medical event and beyond. Results: The simulated patient trajectories were virtually indistinguishable from real patient data, with similar feature means, standard deviations, inter-feature correlations, and covariance structures on a withheld test set. A logistic regression adversary model was unable to distinguish between the real and simulated data (area under the receiver operating characteristic (ROC) curve, AUCadversary = 0.51). Conclusion: Through accurate projection of patient trajectories, this model may help inform clinical decision making or provide virtual control arms for efficient clinical trials.
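An adversary AUC close to 0.5 means the classifier separates real from simulated records no better than chance. The sketch below computes AUC from its pairwise definition to make that reading concrete; the function `auc`, the labels (1 for real, 0 for simulated) and the scores are hypothetical illustrations, not the study's adversary model or data.

```python
def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical scores: the adversary separates the two groups perfectly.
separable = auc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
# Hypothetical scores: the adversary assigns every record the same score.
uninformative = auc([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5])
print(separable, uninformative)
```

Under this definition, a generator whose output is hard to tell apart from real data pushes the adversary toward the second case, which is the direction of the reported value of 0.51.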
9
Wang H, Avillach P. Retracted: Diagnostic Classification and Prognostic Prediction Using Common Genetic Variants in Autism Spectrum Disorder: Genotype-Based Deep Learning. JMIR Med Inform 2021; 9:e24754. [PMID: 33714937 PMCID: PMC8060867 DOI: 10.2196/24754] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 10/03/2020] [Revised: 02/18/2021] [Accepted: 03/14/2021] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND In the United States, about 3 million people have autism spectrum disorder (ASD), and around 1 out of 59 children are diagnosed with ASD. People with ASD have characteristic social communication deficits and repetitive behaviors. The causes of this disorder remain unknown; however, in up to 25% of cases, a genetic cause can be identified. Detecting ASD as early as possible is desirable because early detection of ASD enables timely interventions in children with ASD. Identification of ASD based on objective pathogenic mutation screening is the major first step toward early intervention and effective treatment of affected children. OBJECTIVE Recent investigation interrogated genomics data for detecting and treating autism disorders, in addition to the conventional clinical interview as a diagnostic test. Since deep neural networks perform better than shallow machine learning models on complex and high-dimensional data, in this study, we sought to apply deep learning to genetic data obtained across thousands of simplex families at risk for ASD to identify contributory mutations and to create an advanced diagnostic classifier for autism screening. METHODS After preprocessing the genomics data from the Simons Simplex Collection, we extracted top ranking common variants that may be protective or pathogenic for autism based on a chi-square test. A convolutional neural network-based diagnostic classifier was then designed using the identified significant common variants to predict autism. The performance was then compared with shallow machine learning-based classifiers and randomly selected common variants. RESULTS The selected contributory common variants were significantly enriched in chromosome X while chromosome Y was also discriminatory in determining the identification of autistic individuals from nonautistic individuals. The ARSD, MAGEB16, and MXRA5 genes had the largest effect in the contributory variants. 
Thus, screening algorithms were adapted to include these common variants. The deep learning model yielded an area under the receiver operating characteristic curve of 0.955 and an accuracy of 88% for identifying autistic individuals from nonautistic individuals. Our classifier demonstrated a considerable improvement of ~13% in terms of classification accuracy compared to standard autism screening tools. CONCLUSIONS Common variants are informative for autism identification. Our findings also suggest that the deep learning process is a reliable method for distinguishing the diseased group from the control group based on the common variants of autism.
Affiliation(s)
- Haishuai Wang
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
  - Department of Computer Science and Engineering, Fairfield University, Fairfield, CT, United States
- Paul Avillach
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
10
Wu Y, Chen F, Song H, Feng W, Sun J, Liu R, Li D, Liu Y. Use of a Smartphone Platform to Help With Emergency Management of Acute Ischemic Stroke: Observational Study. JMIR Mhealth Uhealth 2021; 9:e25488. [PMID: 33560236 PMCID: PMC7902188 DOI: 10.2196/25488] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/04/2020] [Revised: 12/06/2020] [Accepted: 01/20/2021] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND To improve the outcomes of acute ischemic stroke (AIS), timely thrombolytic therapy is crucial. A series of strategies has been recommended to reduce door-to-needle (DTN) time for AIS. Mobile technologies are feasible and have been used in stroke management for various purposes. However, the use of smartphone platforms that integrate these strategies through the entire first aid process to improve emergency management of AIS remains to be verified. OBJECTIVE This study aims to describe the utility and application of a smartphone platform in the emergency management of AIS and report the DTN time for patients with AIS during its 2-year application period. Our results are relevant to digital health management. METHODS A smartphone platform named "Green" was developed to incorporate field assessment, hospital recommendation, prehospital notification, real-time communication, clinical records creation, key time-stamping, and quality control in order to streamline and standardize overall AIS emergency management processes. The emergency medical system (EMS) and all the emergency departments in Beijing have used this platform since 2018. From January 1, 2018, to December 31, 2019, 8457 patients diagnosed with AIS received intravenous tissue-type plasminogen activator therapy. The median DTN time and the proportions of patients with DTN times of ≤60 minutes and ≤45 minutes were reported. RESULTS During the 2-year application period of this platform, the median DTN time was 45 minutes, and the proportions of patients with DTN times of ≤60 minutes and ≤45 minutes were 74.6% and 50.5%, respectively. The median DTN time was significantly reduced from 50 minutes in 2018 to 42 minutes in 2019 (P<.001). The proportions of patients with DTN times of ≤60 minutes and ≤45 minutes increased from 66.1% and 40.7%, respectively, in 2018 to 80.7% and 57.3%, respectively, in 2019 (both P<.001). Sustained improvement in DTN time was seen during all the observed months.
The improvement occurred across all facilities, and the variations among hospitals also decreased. The median DTN time for patients transferred by ambulances (43 minutes) was significantly shorter than those who reached hospitals by themselves (47 minutes; P<.001). CONCLUSIONS Sustained reductions in DTN time reflected the improvement in AIS emergency management processes. The use of a smartphone platform integrating recommended strategies throughout all first aid stages is a practical way to help the emergency management of AIS.
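The three summary statistics reported in this abstract (median DTN time and the proportions of patients treated within 60 and 45 minutes) can be computed from a list of per-patient times as sketched below. The function name `summarize_dtn` and the sample times are hypothetical and are not data from the study.

```python
from statistics import median


def summarize_dtn(times_min):
    """Summarize door-to-needle times given in minutes."""
    n = len(times_min)
    return {
        "median": median(times_min),
        "share_le_60": sum(t <= 60 for t in times_min) / n,
        "share_le_45": sum(t <= 45 for t in times_min) / n,
    }


# Hypothetical door-to-needle times in minutes, for illustration only.
dtn = [30, 38, 42, 45, 47, 52, 58, 61, 75, 90]
summary = summarize_dtn(dtn)
print(summary)
```

Reporting the median alongside threshold proportions is useful because DTN times are typically right-skewed, so a few long delays would distort a mean.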
Affiliation(s)
- Yiqun Wu: Department of Epidemiology and Biostatistics, School of Public Health, Peking University Health Science Center, Beijing, China
- Fei Chen: Department of Neurology, Xuanwu Hospital, Capital Medical University, Beijing, China
- Haiqing Song: Department of Neurology, Xuanwu Hospital, Capital Medical University, Beijing, China
- Wuwei Feng: Department of Neurology, Duke University School of Medicine, Durham, NC, United States
- Jinping Sun: Department of Neurology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Ruisen Liu: Beijing Municipal Health Commission, Beijing, China
- Dongmei Li: BEIJING ANMED Medical Technology Co Ltd, Beijing, China
- Ying Liu: Beijing Municipal Health Commission, Beijing, China
|
11
|
Evaluation of ECG Features for the Classification of Post-Stroke Survivors with a Diagnostic Approach. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app11010192] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 02/01/2023]
Abstract
Stroke is considered a major cause of death and of neurological disorders commonly associated with elderly people. Electrocardiogram (ECG) signals are used as a powerful tool in diagnosing stroke, and the analysis of ECG signals has become a focus of stroke research. ECG changes and autonomic dysfunction are reportedly seen in patients with stroke. This study aimed to analyze ECG features and develop a machine-learning classification model for diagnosing stroke, using highly ranked ECG features as input variables. The study included 52 stroke patients (mean age 72.7 years, 63% male) and 80 control subjects (mean age 75.5 years, 39% male) for a total of 132 elderly subjects. Resting ECG signals in the lying down position were measured using the BIOPAC MP150 system. The ECG signals were denoised using the discrete wavelet transform (DWT) method, and features such as heart rate variability (HRV) indices of the time and spectral domains, statistical and impulsive metrics, and fiducial features were extracted and analyzed. Our results showed that the values of the HRV variables were lower in the stroke group, revealing autonomic dysfunction in stroke patients. A statistically significant difference between the groups was observed in the low-frequency (LF)/high-frequency (HF) ratio, the time interval measured after the S wave to the beginning of the T wave (ST) and the time interval measured from the beginning of the Q wave to the end of the T wave (QT) (p < 0.05). Our study also highlighted some statistically significant risk factors of stroke, such as age, male sex and dyslipidemia (p < 0.05). The k-nearest neighbors (KNN) model showed better classification results (accuracy 96.6%, precision 94.3%, recall 99.1% and F1-score 96.6%) than the random forest, support vector machine (SVM), Naïve Bayes and logistic regression models. Thus, our study reported some notable ECG changes in the study participants and also indicated that ECG could aid in diagnosing stroke.
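The F1-score quoted for the KNN model is the harmonic mean of precision and recall, so it can be checked against the other two reported figures. The sketch below uses only the precision (94.3%) and recall (99.1%) stated in the abstract; the helper name `f1_score` is an illustrative choice and not part of the study's code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the KNN model in the abstract.
knn_f1 = f1_score(0.943, 0.991)
print(f"{knn_f1:.1%}")
```

The result rounds to 96.6%, which agrees with the F1-score reported in the abstract.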
|
12
|
Li X, Bian D, Yu J, Li M, Zhao D. Using machine learning models to improve stroke risk level classification methods of China national stroke screening. BMC Med Inform Decis Mak 2019; 19:261. [PMID: 31822270 PMCID: PMC6902572 DOI: 10.1186/s12911-019-0998-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 02/14/2019] [Accepted: 11/25/2019] [Indexed: 12/20/2022]
Abstract
Background With its high incidence, prevalence and mortality, stroke places a heavy burden on families and society in China. In 2009, the Ministry of Health of China launched the China national stroke screening and intervention program, which screens for stroke and its risk factors and conducts interventions in the high-risk population among people aged over 40 years throughout China. In this program, stroke risk factors include hypertension, diabetes, dyslipidemia, smoking, lack of exercise, apparent overweight and family history of stroke. People with more than two risk factors, or with a history of stroke or transient ischemic attack (TIA), are considered high-risk. However, this criterion cannot classify the stroke risk level of people with unknown values in the risk factor fields. Missing stroke risk levels reduce the efficiency of stroke interventions and introduce inaccuracies into national-level statistics. In this paper, we use 2017 national stroke screening data to develop stroke risk classification models based on machine learning algorithms to improve classification efficiency. Method First, we construct training and test sets and rebalance the imbalanced training set using oversampling and undersampling methods. Then, we develop a logistic regression model, Naïve Bayesian model, Bayesian network model, decision tree model, neural network model, random forest model, bagged decision tree model, voting model and boosting model with decision trees to classify stroke risk levels. Result The recall of the boosting model with decision trees is the highest (99.94%), and the precision of the random forest model is the highest (97.33%). Using the random forest model (recall: 98.44%), recall would increase by about 2.8% compared with the method currently used, and several thousand more people at high risk of stroke could be identified each year. Conclusion The models developed in this paper can improve the current screening method by avoiding the impact of unknown values and by avoiding unnecessary rescreening and intervention expenditures. The national stroke screening program can choose among the classification models according to practical needs.
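The limitation of the rule-based criterion described in this abstract can be illustrated with a short sketch. It assumes that unknown values are stored as `None` and that field names such as `history_of_stroke` exist; both are assumptions made for illustration, and the screening program's real data schema is not given in the abstract. The `"undetermined"` label is likewise hypothetical and marks the records that the rule cannot classify.

```python
# Risk factors listed in the abstract; the field names are hypothetical.
RISK_FACTORS = (
    "hypertension", "diabetes", "dyslipidemia", "smoking",
    "lack_of_exercise", "overweight", "family_history_of_stroke",
)
HISTORY_FIELDS = ("history_of_stroke", "history_of_tia")


def classify(record):
    """Apply the rule: more than two risk factors, or a history of
    stroke or TIA, means high risk. None marks an unknown value."""
    factors = [record.get(name) for name in RISK_FACTORS]
    history = [record.get(name) for name in HISTORY_FIELDS]
    known_positive = sum(value is True for value in factors)
    unknown = sum(value is None for value in factors)

    if known_positive > 2 or any(value is True for value in history):
        return "high"
    # Not high risk only if no combination of unknowns could tip the rule.
    if known_positive + unknown <= 2 and all(v is False for v in history):
        return "not high"
    return "undetermined"


# Hypothetical records (illustrative, not screening data).
baseline = {name: False for name in RISK_FACTORS + HISTORY_FIELDS}
three_factors = dict(baseline, hypertension=True, diabetes=True, smoking=True)
one_unknown = dict(baseline, hypertension=True, diabetes=True, smoking=None)

print(classify(baseline), classify(three_factors), classify(one_unknown))
```

Records that fall into the `"undetermined"` branch are the ones for which, according to the abstract, a trained classifier is used in place of the fixed rule.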
Affiliation(s)
- Xuemeng Li: Information Center, Academy of Military Medical Sciences, Beijing, China
- Di Bian: School of Electrical and Control Engineering, Xi'an University of Science and Technology, Xi'an, China
- Jinghui Yu: Information Center, Academy of Military Medical Sciences, Beijing, China
- Mei Li: China Stroke Data Center, Beijing, China
- Dongsheng Zhao: Information Center, Academy of Military Medical Sciences, Beijing, China
|