1
Khudri MM, Rhee KK, Hasan MS, Ahsan KZ. Predicting nutritional status for women of childbearing age from their economic, health, and demographic features: A supervised machine learning approach. PLoS One 2023; 18:e0277738. [PMID: 37172042; PMCID: PMC10180666; DOI: 10.1371/journal.pone.0277738] [Received: 11/01/2022] [Accepted: 05/02/2023] [Indexed: 05/14/2023]
Abstract
BACKGROUND Malnutrition imposes enormous costs resulting from lost investments in human capital and increased healthcare expenditures. There is a dearth of research focusing on the prediction of women's body mass index (BMI) and malnutrition outcomes (underweight, overweight, and obesity) in developing countries. This paper attempts to fill this knowledge gap by predicting the BMI and the risks of malnutrition outcomes for Bangladeshi women of childbearing age from their economic, health, and demographic features. METHODS Data from the 2017-18 Bangladesh Demographic and Health Survey and a series of supervised machine learning (SML) techniques are used. Additionally, this study circumvents the imbalanced distribution problem in obesity classification by utilizing an oversampling approach. RESULTS Study findings demonstrate that the support vector machine and k-nearest neighbor are the two best-performing methods in BMI prediction based on the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE). The combined predictor algorithms consistently yield top specificity, Cohen's kappa, F1-score, and AUC in classifying the malnutrition status, and their performance is robust to alternative standards. The feature importance ranking based on several nonparametric and combined predictors indicates that socioeconomic status, women's age, and breastfeeding status are the most important features in predicting women's nutritional outcomes. Furthermore, the conditional inference trees corroborate that those three features, along with the partner's educational attainment and employment status, significantly predict malnutrition risks. CONCLUSION To the best of our knowledge, this is the first study that predicts BMI and one of the pioneering studies to classify all three malnutrition outcomes for women of childbearing age in Bangladesh, let alone in any lower-middle income country, using SML techniques. Moreover, in the context of Bangladesh, this paper is the first to identify and rank features that are critical in predicting nutritional outcomes using several feature selection algorithms. The estimators from this study predict the outcomes of interest more accurately and efficiently than other existing studies in the relevant literature. Therefore, study findings can aid policymakers in designing policy and programmatic approaches to address the double burden of malnutrition among Bangladeshi women, thereby reducing the country's economic burden.
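The three regression metrics named in this abstract (coefficient of determination, RMSE, and MAE) can be illustrated with a minimal sketch. The function name and the BMI values below are hypothetical and are not taken from the study; the formulas are the standard definitions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired observed and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot  # undefined if all observed values are equal
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae


# Hypothetical BMI values in kg/m^2, for illustration only.
observed = [18.0, 22.0, 26.0, 30.0]
predicted = [19.0, 21.0, 27.0, 29.0]
r2, rmse, mae = regression_metrics(observed, predicted)
```

A higher R² and lower RMSE and MAE indicate a closer fit; RMSE penalizes large errors more heavily than MAE does.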
Affiliation(s)
- Md Mohsan Khudri: Department of Economics, Fogelman College of Business and Economics, The University of Memphis, Memphis, Tennessee, United States of America
- Kang Keun Rhee: Department of Economics, Fogelman College of Business and Economics, The University of Memphis, Memphis, Tennessee, United States of America
- Karar Zunaid Ahsan: Public Health Leadership Program, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
2
Mavragani A, Yamaguchi M, Nishi N, Araki M, Wee LH. Predicting Overweight and Obesity Status Among Malaysian Working Adults With Machine Learning or Logistic Regression: Retrospective Comparison Study. JMIR Form Res 2022; 6:e40404. [PMID: 36476813; PMCID: PMC9773027; DOI: 10.2196/40404] [Received: 06/20/2022] [Revised: 10/09/2022] [Accepted: 10/11/2022] [Indexed: 12/12/2022]
Abstract
BACKGROUND Overweight or obesity is a primary health concern that leads to a significant burden of noncommunicable disease and threatens national productivity and economic growth. Given the complexity of the etiology of overweight or obesity, machine learning (ML) algorithms offer a promising alternative approach in disentangling interdependent factors for predicting overweight or obesity status. OBJECTIVE This study examined the performance of 3 ML algorithms in comparison with logistic regression (LR) to predict overweight or obesity status among working adults in Malaysia. METHODS Using data from 16,860 participants (mean age 34.2, SD 9.0 years; n=6904, 41% male; n=7048, 41.8% with overweight or obesity) in the Malaysia's Healthiest Workplace by AIA Vitality 2019 survey, predictor variables, including sociodemographic characteristics, job characteristics, health and weight perceptions, and lifestyle-related factors, were modeled using the extreme gradient boosting (XGBoost), random forest (RF), and support vector machine (SVM) algorithms, as well as LR, to predict overweight or obesity status based on a BMI cutoff of 25 kg/m². RESULTS The area under the receiver operating characteristic curve was 0.81 (95% CI 0.79-0.82), 0.80 (95% CI 0.79-0.81), 0.80 (95% CI 0.78-0.81), and 0.78 (95% CI 0.77-0.80) for the XGBoost, RF, SVM, and LR models, respectively. Weight satisfaction was the top predictor, and ethnicity, age, and gender were also consistent predictor variables of overweight or obesity status in all models. CONCLUSIONS Based on multi-domain online workplace survey data, this study produced predictive models that identified overweight or obesity status with moderate to high accuracy. The performance of the ML-based and logistic regression models was comparable when predicting obesity among working adults in Malaysia.
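As a rough illustration of the two quantities this abstract relies on, the sketch below derives a binary label from a BMI cutoff of 25 kg/m² and computes the area under the ROC curve as the fraction of positive/negative pairs ranked correctly. The function names, the choice of "greater than or equal to" at the cutoff, and the sample labels and scores are assumptions for illustration, not details from the study.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def is_overweight_or_obese(bmi_value, cutoff=25.0):
    """Binary label; treating BMI >= cutoff as positive is an assumption."""
    return bmi_value >= cutoff


def auc(labels, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case, with ties counted as one half."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores, for illustration only.
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.5, 0.1]
example_auc = auc(example_labels, example_scores)
```

The pairwise form is quadratic in the number of cases, which is fine for a small example; library implementations typically use a sort-based computation instead.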
Affiliation(s)
- Miwa Yamaguchi: National Institute of Health and Nutrition, National Institutes of Biomedical Innovation, Health and Nutrition, Tokyo, Japan
- Nobuo Nishi: National Institute of Health and Nutrition, National Institutes of Biomedical Innovation, Health and Nutrition, Tokyo, Japan
- Michihiro Araki: National Institute of Health and Nutrition, National Institutes of Biomedical Innovation, Health and Nutrition, Tokyo, Japan
- Lei Hum Wee: Centre for Community Health Studies, Faculty of Health Sciences, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia; Faculty of Health and Medical Sciences, School of Medicine, Taylor's University, Selangor, Malaysia
3
Ali S, Na R, Waterhouse M, Jordan SJ, Olsen CM, Whiteman DC, Neale RE. Predicting obesity and smoking using medication data: A machine-learning approach. Pharmacoepidemiol Drug Saf 2021; 31:91-99. [PMID: 34611961; DOI: 10.1002/pds.5367] [Received: 06/07/2021] [Revised: 09/29/2021] [Accepted: 10/01/2021] [Indexed: 12/23/2022]
Abstract
PURPOSE Administrative health datasets are widely used in public health research but often lack information about common confounders. We aimed to develop and validate machine learning (ML)-based models using medication data from Australia's Pharmaceutical Benefits Scheme (PBS) database to predict obesity and smoking. METHODS We used data from the D-Health Trial (N = 18 000) and the QSkin Study (N = 43 794). Smoking history, and height and weight were self-reported at study entry. Linkage to the PBS dataset captured 5 years of medication data after cohort entry. We used age, sex, and medication use, classified using anatomical therapeutic classification codes, as potential predictors of smoking (current or quit <10 years ago; never or quit ≥10 years ago) and obesity (obese; non-obese). We trained gradient-boosted machine learning models using data for the first 80% of participants enrolled; models were validated using the remaining 20%. We assessed model performance overall and by sex and age, and compared models generated using 3 and 5 years of PBS data. RESULTS Based on the validation dataset using 3 years of PBS data, the area under the receiver operating characteristic curve was 0.70 (95% confidence interval [CI] 0.68-0.71) for predicting obesity and 0.71 (95% CI 0.70-0.72) for predicting smoking. Models performed better in women than in men. Using 5 years of PBS data resulted in marginal improvement. CONCLUSIONS Medication data in combination with age and sex can be used to predict obesity and smoking. These models may be of value to researchers using data collected for administrative purposes.
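The validation design described here (training on the first 80% of participants enrolled and validating on the remaining 20%) amounts to an ordered rather than random split. A minimal sketch follows; the function name and the `(participant_id, enrollment_index)` record layout are hypothetical, and the gradient-boosted models themselves are not reproduced.

```python
def chronological_split(participants, train_fraction=0.8):
    """Split participants into (train, validation) by enrollment order.

    `participants` is a list of (participant_id, enrollment_index) tuples;
    this record layout is a hypothetical stand-in for real cohort data.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    # Earliest-enrolled participants come first.
    ordered = sorted(participants, key=lambda row: row[1])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Ten hypothetical participants, supplied in reverse enrollment order.
cohort = [("p%02d" % i, i) for i in range(10)]
cohort.reverse()
train, validation = chronological_split(cohort)
```

Unlike a shuffled split, this keeps the validation set strictly later in enrollment order than the training set, so it also probes how well a model carries over to participants recruited afterwards.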
Affiliation(s)
- Sitwat Ali: Population Health Department, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia; School of Population Health, University of Queensland, Brisbane, Queensland, Australia
- Renhua Na: Population Health Department, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Mary Waterhouse: Population Health Department, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Susan J Jordan: Population Health Department, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia; School of Population Health, University of Queensland, Brisbane, Queensland, Australia
- Catherine M Olsen: Population Health Department, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia; Faculty of Medicine, University of Queensland, Brisbane, Queensland, Australia
- David C Whiteman: Population Health Department, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Rachel E Neale: Population Health Department, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia; School of Population Health, University of Queensland, Brisbane, Queensland, Australia
4
Delnevo G, Mancini G, Roccetti M, Salomoni P, Trombini E, Andrei F. The Prediction of Body Mass Index from Negative Affectivity through Machine Learning: A Confirmatory Study. Sensors (Basel) 2021; 21:2361. [PMID: 33805257; PMCID: PMC8037317; DOI: 10.3390/s21072361] [Received: 02/17/2021] [Revised: 03/17/2021] [Accepted: 03/26/2021] [Indexed: 11/16/2022]
Abstract
This study investigates the relationship between affect-related psychological variables and Body Mass Index (BMI). We have utilized a novel method based on machine learning (ML) algorithms that forecast unobserved BMI values using psychological variables, such as depression, as predictors. We have employed various machine learning algorithms, including gradient boosting and random forest, with psychological variables from 221 subjects to predict both the BMI values and the BMI status (normal, overweight, and obese) of those subjects. We have found that the psychological variables in use allow one to predict both the BMI values (with a mean absolute error of 5.27-5.50) and the BMI status with an accuracy of over 80% (metric: F1-score). Further, our study has also confirmed the particular efficacy of negative psychological variables, such as depression, compared to positive ones, in achieving excellent BMI predictions.
Affiliation(s)
- Giovanni Delnevo: Department of Computer Science and Engineering, University of Bologna, 40127 Bologna, Italy
- Giacomo Mancini: Department of Education, University of Bologna, 40127 Bologna, Italy
- Marco Roccetti: Department of Computer Science and Engineering, University of Bologna, 40127 Bologna, Italy
- Paola Salomoni: Department of Computer Science and Engineering, University of Bologna, 40127 Bologna, Italy
- Elena Trombini: Department of Psychology, University of Bologna, 40127 Bologna, Italy
- Federica Andrei: Department of Psychology, University of Bologna, 40127 Bologna, Italy
5
Sirsat MS, Fermé E, Câmara J. Machine Learning for Brain Stroke: A Review. J Stroke Cerebrovasc Dis 2020; 29:105162. [DOI: 10.1016/j.jstrokecerebrovasdis.2020.105162] [Received: 04/11/2020] [Revised: 07/08/2020] [Accepted: 07/11/2020] [Indexed: 12/29/2022]
6
DeGregory KW, Kuiper P, DeSilvio T, Pleuss JD, Miller R, Roginski JW, Fisher CB, Harness D, Viswanath S, Heymsfield SB, Dungan I, Thomas DM. A review of machine learning in obesity. Obes Rev 2018; 19:668-685. [PMID: 29426065; PMCID: PMC8176949; DOI: 10.1111/obr.12667] [Received: 10/22/2017] [Revised: 11/18/2017] [Accepted: 11/28/2017] [Indexed: 12/15/2022]
Abstract
Rich sources of obesity-related data arising from sensors, smartphone apps, electronic medical health records and insurance data can bring new insights for understanding, preventing and treating obesity. For such large datasets, machine learning provides sophisticated and elegant tools to describe, classify and predict obesity-related risks and outcomes. Here, we review machine learning methods that predict and/or classify, such as linear and logistic regression, artificial neural networks, deep learning and decision tree analysis. We also review methods that describe and characterize data, such as cluster analysis, principal component analysis, network science and topological data analysis. We introduce each method with a high-level overview followed by examples of successful applications. The algorithms were then applied to National Health and Nutrition Examination Survey data to demonstrate methodology, utility and outcomes. The strengths and limitations of each method were also evaluated. This summary of machine learning algorithms provides a unique overview of the state of data analysis applied specifically to obesity.
Affiliation(s)
- K W DeGregory: Department of Mathematical Sciences, United States Military Academy, West Point, NY, USA
- P Kuiper: Department of Mathematical Sciences, United States Military Academy, West Point, NY, USA
- T DeSilvio: Case Western Reserve University, Cleveland, OH, USA
- J D Pleuss: Department of Mathematical Sciences, United States Military Academy, West Point, NY, USA
- R Miller: Department of Mathematical Sciences, United States Military Academy, West Point, NY, USA
- J W Roginski: Department of Mathematical Sciences, United States Military Academy, West Point, NY, USA
- C B Fisher: Department of Mathematical Sciences, United States Military Academy, West Point, NY, USA
- D Harness: Department of Mathematical Sciences, United States Military Academy, West Point, NY, USA
- S Viswanath: Case Western Reserve University, Cleveland, OH, USA
- S B Heymsfield: Pennington Biomedical Research Center, Baton Rouge, LA, USA
- I Dungan: Department of Mathematical Sciences, United States Military Academy, West Point, NY, USA
- D M Thomas: Department of Mathematical Sciences, United States Military Academy, West Point, NY, USA
7
Lee BJ, Jeon YJ, Kim JY. Association of obesity with anatomical and physical indices related to the radial artery in Korean adults. Eur J Integr Med 2017. [DOI: 10.1016/j.eujim.2017.08.007] [Indexed: 11/28/2022]
8
Cielo CA, Pascotini FDS, Haeffner LSB, Ribeiro VV, Christmann MK. Tempo máximo fonatório de /e/ e /ė/ não-vozeado e sua relação com índice de massa corporal e sexo em crianças [Maximum phonation time of /e/ and voiceless /ė/ and its relationship with body mass index and sex in children]. Rev CEFAC 2016. [DOI: 10.1590/1982-021620161825915] [Indexed: 11/22/2022]
Abstract
ABSTRACT Purpose: to characterize and associate the maximum phonation time of voiced /e/ and voiceless /e/ (/ė/), body mass index, and sex in children. Methods: a cross-sectional, observational, analytical, quantitative field study involving 102 children aged eight to 12 years (mean 9.66 years), of whom 53 (51.96%) were girls and 49 (48.04%) were boys. The subjects underwent hearing screening, anthropometric assessment, and collection of the maximum phonation times of /e/ and /ė/. Data were analyzed using the nonparametric Mann-Whitney test and Spearman correlation, with a significance level of 5%. Results: there was no difference in the maximum phonation times of /e/ and /ė/ or in the ė/e ratio according to body mass index or age group; however, boys had a significantly longer maximum phonation time of /e/ than girls. No correlation was found between maximum phonation time and body mass index. Conclusion: there was no difference in the maximum phonation times of /ė/ and /e/ or in the ė/e ratio according to age group and body mass index, and body mass index and maximum phonation times were not correlated, indicating homogeneity among the measures within the group, with no influence of body mass index on maximum phonation times. Regarding sex, boys had a longer maximum phonation time of /e/ than girls, and only the eight-year-old children had maximum phonation times within the expected range.
9
Chen H, Yang B, Liu D, Liu W, Liu Y, Zhang X, Hu L. Using Blood Indexes to Predict Overweight Statuses: An Extreme Learning Machine-Based Approach. PLoS One 2015; 10:e0143003. [PMID: 26600199; DOI: 10.1371/journal.pone.0143003] [Received: 05/06/2015] [Accepted: 10/29/2015] [Indexed: 11/25/2022]
Abstract
The number of overweight people continues to rise across the world. Studies have shown that being overweight can increase health risks, such as high blood pressure, diabetes mellitus, coronary heart disease, and certain forms of cancer. Therefore, identifying the overweight status in people is critical to prevent and decrease health risks. This study explores a new technique that uses blood and biochemical measurements to recognize the overweight condition. A new machine learning technique, an extreme learning machine, was developed to accurately detect the overweight status from a pool of 225 overweight and 251 healthy subjects. The group included 179 males and 297 females. The detection method was rigorously evaluated against the real-life dataset for accuracy, sensitivity, specificity, and AUC (area under the receiver operating characteristic (ROC) curve) criterion. Additionally, the feature selection was investigated to identify correlating factors for the overweight status. The results demonstrate that there are significant differences in blood and biochemical indexes between healthy and overweight people (p-value < 0.01). According to the feature selection, the most important correlated indexes are creatinine, hemoglobin, hematocrit, uric acid, red blood cells, high density lipoprotein, alanine transaminase, triglyceride, and γ-glutamyl transpeptidase. These are consistent with the results of Spearman test analysis. The proposed method holds promise as a new, accurate method for identifying the overweight status in subjects.
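The accuracy, sensitivity, and specificity criteria mentioned in this abstract all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the label vectors are hypothetical and do not reproduce the study's data or its extreme learning machine.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for 0/1 labels (1 = overweight)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of overweight cases detected
        "specificity": tn / (tn + fp),  # share of healthy cases cleared
    }


# Hypothetical true and predicted labels, for illustration only.
true_labels = [1, 1, 1, 0, 0, 0, 0, 0]
pred_labels = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = classification_metrics(true_labels, pred_labels)
```

Reporting sensitivity and specificity alongside accuracy matters when the two classes are of unequal size, because accuracy alone can hide poor detection of the smaller class.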
10
11
Bashshur RL, Shannon GW, Smith BR, Woodward MA. The empirical evidence for the telemedicine intervention in diabetes management. Telemed J E Health 2015; 21:321-54. [PMID: 25806910; PMCID: PMC4432488; DOI: 10.1089/tmj.2015.0029] [Received: 02/17/2015] [Accepted: 02/17/2015] [Indexed: 12/30/2022]
Abstract
OBJECTIVE The research presented here assesses the scientific evidence for the telemedicine intervention in the management of diabetes (telediabetes), gestational diabetes, and diabetic retinopathy. The impetus derives from the confluence of high prevalence of these diseases, increasing incidence, and rising costs, while telemedicine promises to ameliorate, if not prevent, type 2 diabetes and its complications. MATERIALS AND METHODS A purposeful review of the literature identified relevant publications from January 2005 to December 2013. These were culled to retain only credible research articles for detailed review and analysis. The search yielded approximately 17,000 articles with no date constraints. Of these, 770 appeared to be research articles within our time frame. A review of the abstracts yielded 73 articles that met the criteria for inclusion in the final analysis. Evidence is organized by research findings regarding feasibility/acceptance, intermediate outcomes (e.g., use of service and screening compliance), and health outcomes (control of glycemic level, lipids, body weight, and physical activity). RESULTS Definitions of telediabetes varied from study to study vis-à-vis diabetes subtype, setting, technology, staffing, duration, frequency, and target population. Outcome measures also varied. Despite these vagaries, sufficient evidence was obtained from a wide variety of research studies, consistently pointing to positive effects of telemonitoring and telescreening in terms of glycemic control, reduced body weight, and increased physical exercise. The major contributions point to telemedicine's potential for changing behaviors important to diabetes control and prevention, especially type 2 and gestational diabetes. Similarly, screening and monitoring for retinopathy can detect symptoms early that may be controlled or treated. CONCLUSIONS Overall, there is strong and consistent evidence of improved glycemic control among persons with type 2 and gestational diabetes as well as effective screening and monitoring of diabetic retinopathy.
Affiliation(s)
- Rashid L. Bashshur: E-Health Center, University of Michigan Health System, Ann Arbor, Michigan
- Gary W. Shannon: Department of Geography, University of Kentucky, Lexington, Kentucky
- Brian R. Smith: E-Health Center, University of Michigan Health System, Ann Arbor, Michigan
- Maria A. Woodward: Departments of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, Michigan