1
Begum N, Rahman MM, Omar Faruk M. Machine learning prediction of nutritional status among pregnant women in Bangladesh: Evidence from Bangladesh demographic and health survey 2017-18. PLoS One 2024; 19:e0304389. [PMID: 38820295 PMCID: PMC11142495 DOI: 10.1371/journal.pone.0304389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Accepted: 05/12/2024] [Indexed: 06/02/2024]
Abstract
AIM Malnutrition in pregnant women significantly affects the health of both mother and child. This research aims to identify the best machine learning (ML) techniques for predicting the nutritional status of pregnant women in Bangladesh and to detect the most essential features based on the best-performing algorithm. METHODS This study used retrospective cross-sectional data from the Bangladesh Demographic and Health Survey 2017-18. Different feature transformations and machine learning classifiers were applied to find the best transformation and classification model. RESULTS Robust scaling outperformed all other feature transformation methods. The Random Forest algorithm with robust scaling outperformed all other machine learning algorithms, with 74.75% accuracy, a kappa statistic of 57.91%, 73.36% precision, 73.08% recall, and a 73.09% F1 score. In addition, the Random Forest algorithm had the highest precision (76.76%) and F1 score (71.71%) for predicting the underweight class, as well as an expected precision of 82.01% and F1 score of 83.78% for the overweight/obese class, when compared to other algorithms with the robust scaling method. The respondent's age, wealth index, region, husband's education level, husband's age, and occupation were crucial features for predicting the nutritional status of pregnant women in Bangladesh. CONCLUSION The proposed classifier could help predict the expected outcome and reduce the burden of malnutrition among pregnant women in Bangladesh.
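The abstract does not spell out what robust scaling does, so the following is a minimal sketch under the usual definition: each feature is centered on its median and divided by its interquartile range (IQR), which keeps outliers from dominating the scale. The function name `robust_scale` and the sample values are illustrative assumptions, not taken from the study.

```python
from statistics import median, quantiles

def robust_scale(values):
    """Center on the median and divide by the interquartile range (IQR)."""
    # Quartiles with linear interpolation between data points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: fall back to a scale of 1 when the IQR is zero.
    scale = iqr if iqr else 1.0
    med = median(values)
    return [(x - med) / scale for x in values]

# Hypothetical feature column with one extreme value.
ages = [1, 2, 3, 4, 100]
scaled = robust_scale(ages)
```

Because the median and IQR ignore the extreme value, the first four points keep a spread of about one unit while the outlier stays clearly separated.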
Affiliation(s)
- Najma Begum
  - Department of Statistics, Noakhali Science and Technology University, Noakhali, Bangladesh
- Mohammad Omar Faruk
  - Department of Statistics, Noakhali Science and Technology University, Noakhali, Bangladesh
2
Sewpaul R, Awe OO, Dogbey DM, Sekgala MD, Dukhi N. Classification of Obesity among South African Female Adolescents: Comparative Analysis of Logistic Regression and Random Forest Algorithms. Int J Environ Res Public Health 2023; 21:2. [PMID: 38276791 PMCID: PMC10815679 DOI: 10.3390/ijerph21010002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 12/14/2023] [Accepted: 12/15/2023] [Indexed: 01/27/2024]
Abstract
BACKGROUND This study evaluates the performance of logistic regression (LR) and random forest (RF) algorithms to model obesity among female adolescents in South Africa. METHODS Data were analysed for 375 females aged 15-17 from the South African National Health and Nutrition Examination Survey 2011/2012. The primary outcome was obesity, defined as body mass index (BMI) ≥ 30 kg/m². A total of 31 explanatory variables were included, covering socio-economic, demographic, family history, dietary, and health behaviour factors. RF and LR models were run using imbalanced data as well as after oversampling, undersampling, and hybrid sampling of the data. RESULTS Using the imbalanced data, the RF model performed better with higher precision, recall, F1 score, and balanced accuracy. Balanced accuracy was highest with the hybrid data (0.618 for RF and 0.668 for LR). Using the hybrid balanced data, the RF model performed better (F1-score = 0.940 for RF vs. 0.798 for LR). CONCLUSION The model with the highest overall performance metrics was the RF model both before balancing the data and after applying hybrid balancing. Future work would benefit from using larger datasets on adolescent female obesity to assess the robustness of the models.
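Balanced accuracy matters here because the outcome is imbalanced. As a rough illustration, assuming the standard binary definition (the mean of sensitivity and specificity), the metric can be computed as below. The function name and the toy labels are hypothetical and are not data from the study.

```python
def balanced_accuracy(y_true, y_pred, positive=1):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

# Toy example: 2 obese (1) and 8 non-obese (0); the model predicts 0 for everyone.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
score = balanced_accuracy(y_true, y_pred)  # plain accuracy would be 0.8
```

A classifier that always predicts the majority class scores 0.5 on this metric, which is why it is preferred over plain accuracy for imbalanced outcomes.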
Affiliation(s)
- Ronel Sewpaul
  - Public Health, Societies and Belonging, Human Sciences Research Council, Merchant House, 2 Dock Rail Road, Cape Town 8001, South Africa
- Olushina Olawale Awe
  - Institute of Mathematics, Statistics and Scientific Computing (IMECC), University of Campinas, Campinas 13083-859, Brazil
- Dennis Makafui Dogbey
  - Medical Biotechnology and Immunotherapy Research Unit, Institute of Infectious Diseases and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town 7700, South Africa
- Natisha Dukhi
  - Public Health, Societies and Belonging, Human Sciences Research Council, Merchant House, 2 Dock Rail Road, Cape Town 8001, South Africa
3
Solomon DD, Khan S, Garg S, Gupta G, Almjally A, Alabduallah BI, Alsagri HS, Ibrahim MM, Abdallah AMA. Hybrid Majority Voting: Prediction and Classification Model for Obesity. Diagnostics (Basel) 2023; 13:2610. [PMID: 37568973 PMCID: PMC10417773 DOI: 10.3390/diagnostics13152610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Revised: 07/26/2023] [Accepted: 07/31/2023] [Indexed: 08/13/2023]
Abstract
Because it is associated with most multifactorial inherited diseases like heart disease, hypertension, diabetes, and other serious medical conditions, obesity is a major global health concern. Obesity is caused by hereditary, physiological, and environmental factors, as well as poor nutrition and a lack of exercise. Weight loss can be difficult for various reasons. Obesity is diagnosed via BMI, which is used to estimate body fat for most people. Muscular athletes, for example, may have a BMI in the obesity range even when they are not obese. Researchers from a variety of backgrounds and institutions have devised different hypotheses and models for the prediction and classification of obesity using different approaches and various machine learning techniques. In this study, a majority voting-based hybrid modeling approach using a gradient boosting classifier, extreme gradient boosting, and a multilayer perceptron was developed. Seven distinct machine learning algorithms were used on open datasets from the UCI machine learning repository, and their respective accuracy levels were compared before the combined approaches were chosen. The proposed majority voting-based hybrid model for prediction and classification of obesity achieved an accuracy of 97.16%, which is greater than both the individual models and the other hybrid models that have been developed.
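The abstract does not say how votes are combined, so the sketch below assumes plain hard voting: each model casts one label per sample and the most frequent label wins. The function name, the tie-breaking rule, and the example predictions are illustrative assumptions rather than the authors' implementation.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists into one label per sample by hard voting.

    predictions: list of equal-length label lists, one list per model.
    Ties go to the label seen first, i.e. the earliest model in the list.
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted

# Hypothetical outputs of three base models on three samples.
gbc = ["obese", "normal", "overweight"]
xgb = ["obese", "overweight", "overweight"]
mlp = ["normal", "overweight", "normal"]
combined = majority_vote([gbc, xgb, mlp])
# combined == ["obese", "overweight", "overweight"]
```

With three voters and more than two classes, a three-way tie is possible; here it resolves to the first model's label, which is one simple choice among several.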
Affiliation(s)
- Dahlak Daniel Solomon
  - Yogananda School of AI Computers and Data Sciences, Shoolini University, Solan 173229, India
- Shakir Khan
  - College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia (A.M.A.A.)
  - Department of Computer Science and Engineering, University Centre for Research and Development, Chandigarh University, Mohali 140413, India
- Sonia Garg
  - Yogananda School of AI Computers and Data Sciences, Shoolini University, Solan 173229, India
- Gaurav Gupta
  - Yogananda School of AI Computers and Data Sciences, Shoolini University, Solan 173229, India
- Abrar Almjally
  - College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia (A.M.A.A.)
- Bayan Ibrahimm Alabduallah
  - Department of Information System, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh 11432, Saudi Arabia
- Hatoon S. Alsagri
  - College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia (A.M.A.A.)
- Mandour Mohamed Ibrahim
  - College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia (A.M.A.A.)
- Alsadig Mohammed Adam Abdallah
  - College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia (A.M.A.A.)
4
Mondal PK, Foysal KH, Norman BA, Gittner LS. Predicting Childhood Obesity Based on Single and Multiple Well-Child Visit Data Using Machine Learning Classifiers. Sensors (Basel) 2023; 23:759. [PMID: 36679555 PMCID: PMC9865403 DOI: 10.3390/s23020759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2022] [Revised: 01/05/2023] [Accepted: 01/05/2023] [Indexed: 06/17/2023]
Abstract
Childhood obesity is a public health concern in the United States. Consequences of childhood obesity include metabolic disease and heart, lung, kidney, and other health-related comorbidities. Therefore, the early determination of obesity risk is needed and predicting the trend of a child's body mass index (BMI) at an early age is crucial. Early identification of obesity can lead to early prevention. Multiple methods have been tested and evaluated to assess obesity trends in children. Available growth charts help determine a child's current obesity level but do not predict future obesity risk. The present methods of predicting obesity include regression analysis and machine learning-based classifications and risk factor (threshold)-based categorizations based on specific criteria. All the present techniques, especially current machine learning-based methods, require longitudinal data and information on a large number of variables related to a child's growth (e.g., socioeconomic, family-related factors) in order to predict future obesity-risk. In this paper, we propose three different techniques for three different scenarios to predict childhood obesity based on machine learning approaches and apply them to real data. Our proposed methods predict obesity for children at five years of age using the following three data sets: (1) a single well-child visit, (2) multiple well-child visits under the age of two, and (3) multiple random well-child visits under the age of five. Our models are especially important for situations where only the current patient information is available rather than having multiple data points from regular spaced well-child visits. Our models predict obesity using basic information such as birth BMI, gestational age, BMI measures from well-child visits, and gender. 
Our models can predict a child's obesity category (normal, overweight, or obese) at five years of age with an accuracy of 89%, 77%, and 89%, for the three application scenarios, respectively. Therefore, our proposed models can assist healthcare professionals by acting as a decision support tool to aid in predicting childhood obesity early in order to reduce obesity-related complications, and in turn, improve healthcare.
Affiliation(s)
- Pritom Kumar Mondal
  - Department of Industrial, Manufacturing & Systems Engineering, Texas Tech University, Lubbock, TX 79409, USA
- Kamrul H. Foysal
  - Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, TX 79409, USA
- Bryan A. Norman
  - Department of Industrial, Manufacturing & Systems Engineering, Texas Tech University, Lubbock, TX 79409, USA
- Lisaann S. Gittner
  - Department of Public Health, Texas Tech University Health Sciences Center, Lubbock, TX 79430, USA
5
Khudri MM, Rhee KK, Hasan MS, Ahsan KZ. Predicting nutritional status for women of childbearing age from their economic, health, and demographic features: A supervised machine learning approach. PLoS One 2023; 18:e0277738. [PMID: 37172042 PMCID: PMC10180666 DOI: 10.1371/journal.pone.0277738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Accepted: 05/02/2023] [Indexed: 05/14/2023]
Abstract
BACKGROUND Malnutrition imposes enormous costs resulting from lost investments in human capital and increased healthcare expenditures. There is a dearth of research focusing on the prediction of women's body mass index (BMI) and malnutrition outcomes (underweight, overweight, and obesity) in developing countries. This paper attempts to fill this knowledge gap by predicting the BMI and the risks of malnutrition outcomes for Bangladeshi women of childbearing age from their economic, health, and demographic features. METHODS Data from the 2017-18 Bangladesh Demographic and Health Survey and a series of supervised machine learning (SML) techniques are used. Additionally, this study circumvents the imbalanced distribution problem in obesity classification by utilizing an oversampling approach. RESULTS Study findings demonstrate that the support vector machine and k-nearest neighbor are the two best-performing methods in BMI prediction based on the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE). The combined predictor algorithms consistently yield top specificity, Cohen's kappa, F1-score, and AUC in classifying the malnutrition status, and their performance is robust to alternative standards. The feature importance ranking based on several nonparametric and combined predictors indicates that socioeconomic status, women's age, and breastfeeding status are the most important features in predicting women's nutritional outcomes. Furthermore, the conditional inference trees corroborate that those three features, along with the partner's educational attainment and employment status, significantly predict malnutrition risks. CONCLUSION To the best of our knowledge, this is the first study that predicts BMI and one of the pioneer studies to classify all three malnutrition outcomes for women of childbearing age in Bangladesh, let alone in any lower-middle income country, using SML techniques. 
Moreover, in the context of Bangladesh, this paper is the first to identify and rank features that are critical in predicting nutritional outcomes using several feature selection algorithms. The estimators from this study predict the outcomes of interest most accurately and efficiently compared to other existing studies in the relevant literature. Therefore, study findings can aid policymakers in designing policy and programmatic approaches to address the double burden of malnutrition among Bangladeshi women, thereby reducing the country's economic burden.
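Cohen's kappa, one of the classification metrics named above, corrects observed agreement for the agreement expected by chance. The following is a minimal sketch of the standard unweighted formula, kappa = (p_o - p_e) / (1 - p_e); the function name and the toy labels are assumptions for illustration and do not come from the study.

```python
from collections import Counter

def cohens_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa between true and predicted class labels."""
    n = len(y_true)
    # Observed agreement: share of samples where the labels match.
    observed = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    # Chance agreement from the marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)

# Toy labels: "u" = underweight, "n" = normal.
kappa = cohens_kappa(["u", "u", "n", "n"], ["u", "n", "n", "n"])
```

In this toy case observed agreement is 0.75 and chance agreement is 0.5, so kappa is 0.5, noticeably lower than the raw accuracy.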
Affiliation(s)
- Md Mohsan Khudri
  - Department of Economics, Fogelman College of Business and Economics, The University of Memphis, Memphis, Tennessee, United States of America
- Kang Keun Rhee
  - Department of Economics, Fogelman College of Business and Economics, The University of Memphis, Memphis, Tennessee, United States of America
- Karar Zunaid Ahsan
  - Public Health Leadership Program, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
6
An R, Shen J, Xiao Y. Applications of Artificial Intelligence to Obesity Research: Scoping Review of Methodologies. J Med Internet Res 2022; 24:e40589. [PMID: 36476515 PMCID: PMC9856437 DOI: 10.2196/40589] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/28/2022] [Revised: 10/05/2022] [Accepted: 11/01/2022] [Indexed: 12/13/2022]
Abstract
BACKGROUND Obesity is a leading cause of preventable death worldwide. Artificial intelligence (AI), characterized by machine learning (ML) and deep learning (DL), has become an indispensable tool in obesity research. OBJECTIVE This scoping review aimed to provide researchers and practitioners with an overview of the AI applications to obesity research, familiarize them with popular ML and DL models, and facilitate the adoption of AI applications. METHODS We conducted a scoping review in PubMed and Web of Science on the applications of AI to measure, predict, and treat obesity. We summarized and categorized the AI methodologies used in the hope of identifying synergies, patterns, and trends to inform future investigations. We also provided a high-level, beginner-friendly introduction to the core methodologies to facilitate the dissemination and adoption of various AI techniques. RESULTS We identified 46 studies that used diverse ML and DL models to assess obesity-related outcomes. The studies found AI models helpful in detecting clinically meaningful patterns of obesity or relationships between specific covariates and weight outcomes. The majority (18/22, 82%) of the studies comparing AI models with conventional statistical approaches found that the AI models achieved higher prediction accuracy on test data. Some (5/46, 11%) of the studies comparing the performances of different AI models revealed mixed results, indicating the high contingency of model performance on the data set and task it was applied to. An accelerating trend of adopting state-of-the-art DL models over standard ML models was observed to address challenging computer vision and natural language processing tasks. We concisely introduced the popular ML and DL models and summarized their specific applications in the studies included in the review. 
CONCLUSIONS This study reviewed AI-related methodologies adopted in the obesity literature, particularly ML and DL models applied to tabular, image, and text data. The review also discussed emerging trends such as multimodal or multitask AI models, synthetic data generation, and human-in-the-loop that may witness increasing applications in obesity research.
Affiliation(s)
- Ruopeng An
  - Brown School, Washington University in St. Louis, St. Louis, MO, United States
- Jing Shen
  - Department of Physical Education, China University of Geosciences, Beijing, China
- Yunyu Xiao
  - Weill Cornell Medical College, Cornell University, Ithaca, NY, United States
7
Bhatia A, Smetana S, Heinz V, Hertzberg J. Modeling obesity in complex food systems: Systematic review. Front Endocrinol (Lausanne) 2022; 13:1027147. [PMID: 36313777 PMCID: PMC9606209 DOI: 10.3389/fendo.2022.1027147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Accepted: 09/27/2022] [Indexed: 11/20/2022]
Abstract
Obesity-related data derived from multiple complex systems spanning media, social, economic, food activity, health records, and infrastructure (sensors, smartphones, etc.) can assist us in understanding the relationship between obesity drivers for more efficient prevention and treatment. The reviewed literature shows a growing adoption of machine-learning models in recent years, dealing with mechanisms and interventions in social influence, nutritional diet, eating behavior, physical activity, built environment, obesity prevalence prediction, distribution, and healthcare cost-related outcomes of obesity. Most models are designed to reflect through time and space at the individual level in a population, which indicates the need for a macro-level generalized population model. The model should consider all interconnected multi-system drivers to address obesity prevalence and intervention. This paper reviews existing computational models and datasets used to compute obesity outcomes to design a conceptual framework for establishing a macro-level generalized obesity model.
Affiliation(s)
- Anita Bhatia
  - Food Data Group, German Institute of Food Technologies (DIL e.V.), Quakenbrück, Germany
  - Knowledge-Based Systems Research Group, Institute of Computer Science, University of Osnabrück, Osnabrück, Germany
- Sergiy Smetana
  - Food Data Group, German Institute of Food Technologies (DIL e.V.), Quakenbrück, Germany
- Volker Heinz
  - Food Data Group, German Institute of Food Technologies (DIL e.V.), Quakenbrück, Germany
- Joachim Hertzberg
  - Knowledge-Based Systems Research Group, Institute of Computer Science, University of Osnabrück, Osnabrück, Germany
  - Plan-Based Robot Control German Research Center for Artificial Intelligence, Osnabrück, Germany
8
Alsareii SA, Shaf A, Ali T, Zafar M, Alamri AM, AlAsmari MY, Irfan M, Awais M. IoT Framework for a Decision-Making System of Obesity and Overweight Extrapolation among Children, Youths, and Adults. Life (Basel) 2022; 12:1414. [PMID: 36143450 PMCID: PMC9500775 DOI: 10.3390/life12091414] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/08/2022] [Revised: 08/28/2022] [Accepted: 09/05/2022] [Indexed: 01/16/2023]
Abstract
Approximately 30% of the global population, about 2.1 billion people, is overweight or obese. This share is expected to surpass 40% by 2030 if the current trend continues. The global pandemic due to COVID-19 will also impact the predicted obesity rates. It will cause a significant increase in morbidity and mortality worldwide. Obesity is associated with multiple chronic diseases and several risk factors. Understanding these risk factors and the prevalence of obesity involves various challenges. Therefore, diagnosing obesity in its initial stages might significantly increase the patient’s chances of effective treatment. The Internet of Things (IoT) has reached an evolving stage in contemporary healthcare thanks to advancements in information and communication technologies. Therefore, in this paper, we thoroughly investigated machine learning techniques for building an IoT-enabled system. In the first phase, the proposed system analyzed the performance of random forest (RF), K-nearest neighbor (KNN), support vector machine (SVM), decision tree (DT), logistic regression (LR), and naïve Bayes (NB) algorithms on the obesity dataset. The second phase introduced an IoT-based framework that adopts a multi-user request system by uploading the data to the cloud for the early diagnosis of obesity. The IoT framework makes the system available to anyone, anywhere, for precise obesity categorization. This research will help the reader understand the relationships between risk factors and weight changes, along with their visualizations. Furthermore, it also focuses on how existing datasets can help one study the nature of obesity and which classification and regression models perform well relative to others.
Affiliation(s)
- Saeed Ali Alsareii
  - Department of Surgery, College of Medicine, Najran University Saudi Arabia, Najran 11001, Saudi Arabia
- Ahmad Shaf
  - Department of Computer Science, COMSATS University Islamabad, Sahiwal Campus, Sahiwal 57000, Pakistan
- Tariq Ali
  - Department of Computer Science, COMSATS University Islamabad, Sahiwal Campus, Sahiwal 57000, Pakistan
- Maryam Zafar
  - Department of Computer Science, COMSATS University Islamabad, Sahiwal Campus, Sahiwal 57000, Pakistan
- Abdulrahman Manaa Alamri
  - Department of Surgery, College of Medicine, Najran University Saudi Arabia, Najran 11001, Saudi Arabia
- Mansour Yousef AlAsmari
  - Department of Surgery, College of Medicine, Najran University Saudi Arabia, Najran 11001, Saudi Arabia
- Muhammad Irfan
  - Electrical Engineering Department, College of Engineering, Najran University Saudi Arabia, Najran 11001, Saudi Arabia
- Muhammad Awais
  - Department of Computer Science, Edge Hill University, St Helens Rd, Ormskirk L39 4QP, UK
9
Davies T, Louie JCY, Ndanuko R, Barbieri S, Perez-Concha O, Wu JHY. A Machine Learning Approach to Predict the Added-Sugar Content of Packaged Foods. J Nutr 2022; 152:343-349. [PMID: 34550390 DOI: 10.1093/jn/nxab341] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/29/2021] [Revised: 08/17/2021] [Accepted: 09/16/2021] [Indexed: 01/16/2023]
Abstract
BACKGROUND Dietary guidelines recommend limiting the intake of added sugars. However, despite the public health importance, most countries have not mandated the labeling of added-sugar content on packaged foods and beverages, making it difficult for consumers to avoid products with added sugar, and limiting the ability of policymakers to identify priority products for intervention. OBJECTIVE The aim was to develop a machine learning approach for the prediction of added-sugar content in packaged products using available nutrient, ingredient, and food category information. METHODS The added-sugar prediction algorithm was developed using k-nearest neighbors (KNN) and packaged food information from the US Label Insight dataset (n = 70,522). A synthetic dataset of Australian packaged products (n = 500) was used to assess validity and generalization. Performance metrics included the coefficient of determination (R²), mean absolute error (MAE), and Spearman rank correlation (ρ). As a benchmark, the KNN approach was compared with an existing added-sugar prediction approach that relies on a series of manual steps. RESULTS Compared with the existing added-sugar prediction approach, the KNN approach was similarly apt at explaining variation in added-sugar content (R² = 0.96 vs. 0.97, respectively) and ranking products from highest to lowest in added-sugar content (ρ = 0.91 vs. 0.93, respectively), while less apt at minimizing absolute deviations between predicted and true values (MAE = 1.68 g vs. 1.26 g per 100 g or 100 mL, respectively). CONCLUSIONS KNN can be used to predict added-sugar content in packaged products with a high degree of validity. Being automated, KNN can easily be applied to large datasets. Such predicted added-sugar levels can be used to monitor the food supply and inform interventions aimed at reducing added-sugar intake.
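To make the KNN regression idea concrete, here is a minimal sketch that predicts a numeric target as the mean of the k closest training examples and scores predictions with MAE. The feature layout, distance metric (Euclidean), value of k, and the data are hypothetical; the abstract does not state the study's actual features or settings.

```python
import math

def knn_predict(train_X, train_y, query, k=3):
    """Predict a value as the mean target of the k nearest training points."""
    # Sort (distance, target) pairs by Euclidean distance to the query.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    nearest = [y for _, y in dists[:k]]
    return sum(nearest) / len(nearest)

def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical one-feature example: total sugar (g/100 g) -> added sugar (g/100 g).
train_X = [(0.0,), (1.0,), (2.0,), (10.0,)]
train_y = [0.0, 1.0, 2.0, 10.0]
pred = knn_predict(train_X, train_y, (1.0,), k=3)
error = mean_absolute_error([1.0, 2.0], [1.5, 1.0])
```

Because prediction only requires a distance computation against stored examples, the approach needs no manual rule-writing, which is consistent with the abstract's point that it scales easily to large product datasets.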
Affiliation(s)
- Tazman Davies
  - The George Institute for Global Health, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
- Jimmy Chun Yu Louie
  - The George Institute for Global Health, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
  - School of Biological Sciences, Faculty of Science, The University of Hong Kong, Hong Kong Special Administrative Region
- Rhoda Ndanuko
  - The George Institute for Global Health, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
- Sebastiano Barbieri
  - Centre for Big Data Research in Health, University of New South Wales, Sydney, New South Wales, Australia
- Oscar Perez-Concha
  - Centre for Big Data Research in Health, University of New South Wales, Sydney, New South Wales, Australia
- Jason H Y Wu
  - The George Institute for Global Health, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
10
Höskuldsdóttir G, Engström M, Rawshani A, Wallenius V, Lenér F, Fändriks L, Mossberg K, Eliasson B. The BAriatic surgery SUbstitution and nutrition (BASUN) population: a data-driven exploration of predictors for obesity. BMC Endocr Disord 2021; 21:183. [PMID: 34507573 PMCID: PMC8431862 DOI: 10.1186/s12902-021-00849-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2021] [Accepted: 08/18/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND The development of obesity is most likely due to a combination of biological and environmental factors some of which might still be unidentified. We used a machine learning technique to examine the relative importance of more than 100 clinical variables as predictors for BMI. METHODS BASUN is a prospective non-randomized cohort study of 971 individuals that received medical or surgical treatment (treatment choice was based on patient's preferences and clinical criteria, not randomization) for obesity in the Västra Götaland county in Sweden between 2015 and 2017 with planned follow-up for 10 years. This study includes demographic data, BMI, blood tests, and questionnaires before obesity treatment that cover three main areas: gastrointestinal symptoms and eating habits, physical activity and quality of life, and psychological health. We used random forest, with conditional variable importance, to study the relative importance of roughly 100 predictors of BMI, covering 15 domains. We quantified the predictive value of each individual predictor, as well as each domain. RESULTS The participants received medical (n = 382) or surgical treatment for obesity (Roux-en-Y gastric bypass, n = 388; sleeve gastrectomy, n = 201). There were minor differences between these groups before treatment with regard to anthropometrics, laboratory measures and results from questionnaires. The 10 individual variables with the strongest predictive value, in order of decreasing strength, were country of birth, marital status, sex, calcium levels, age, levels of TSH and HbA1c, AUDIT score, BE tendencies according to QEWPR, and TG levels. The strongest domains predicting BMI were: Socioeconomic status, Demographics, Biomarkers (notably TSH), Lifestyle/habits, Biomarkers for cardiovascular disease and diabetes, and Potential anxiety and depression. CONCLUSIONS Lifestyle, habits, age, sex and socioeconomic status are some of the strongest predictors for BMI levels. 
Potential anxiety and/or depression and other characteristics captured using questionnaires have strong predictive value. These results confirm previously suggested associations and advocate prospective studies to examine the value of better characterization of patients eligible for obesity treatment, and consequently to evaluate the treatment effects in groups of patients. TRIAL REGISTRATION March 03, 2015; NCT03152617.
Collapse
Affiliation(s)
- Gudrún Höskuldsdóttir
  - Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Department of Medicine, Sahlgrenska University Hospital, 413 45, Gothenburg, Sweden
- My Engström
  - Institute of Health and Care Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Department of Surgery, Region Västra Götaland, Sahlgrenska University Hospital, Gothenburg, Sweden
- Araz Rawshani
  - Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Ville Wallenius
  - Institute of Clinical Sciences, University of Gothenburg, Gothenburg, Sweden
  - Department of Surgery, Sahlgrenska University Hospital, Gothenburg, Sweden
- Frida Lenér
  - Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Department of Public Health and Community Medicine, Sahlgrenska University Hospital, Gothenburg, Sweden
- Lars Fändriks
  - Institute of Clinical Sciences, University of Gothenburg, Gothenburg, Sweden
  - Department of Surgery, Sahlgrenska University Hospital, Gothenburg, Sweden
- Karin Mossberg
  - Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Department of Public Health and Community Medicine, Sahlgrenska University Hospital, Gothenburg, Sweden
- Björn Eliasson
  - Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Department of Medicine, Sahlgrenska University Hospital, 413 45, Gothenburg, Sweden
11
Safaei M, Sundararajan EA, Driss M, Boulila W, Shapi'i A. A systematic literature review on obesity: Understanding the causes & consequences of obesity and reviewing various machine learning approaches used to predict obesity. Comput Biol Med 2021; 136:104754. [PMID: 34426171 DOI: 10.1016/j.compbiomed.2021.104754] [Citation(s) in RCA: 167] [Impact Index Per Article: 55.7] [Received: 04/26/2021] [Revised: 08/05/2021] [Accepted: 08/05/2021] [Indexed: 01/02/2023]
Abstract
Obesity is considered a principal public health concern and is ranked as the fifth leading cause of death globally. Overweight and obesity are among the main lifestyle illnesses; they lead to further health concerns and contribute to numerous chronic diseases, including cancers, diabetes, metabolic syndrome, and cardiovascular diseases. The World Health Organization has also predicted that 30% of deaths worldwide will be caused by lifestyle diseases in 2030, and that these deaths can be prevented through suitable identification of associated risk factors and behavioral involvement policies. Detecting and diagnosing obesity as early as possible is therefore crucial. Machine learning is a promising approach to the early prediction of obesity and the risk of overweight because it can offer quick and accurate identification of risk factors and condition likelihoods. The present study conducted a systematic literature review to examine obesity research and machine learning techniques for the prevention and treatment of obesity from 2010 to 2020. Accordingly, 93 papers were identified from the review articles as primary studies, out of an initial pool of over 700 papers addressing obesity. This study first identified the significant potential factors that influence and cause adult obesity. Next, it investigated the main diseases and health consequences of obesity and overweight. It then identified the machine learning methods that can be used for the prediction of obesity. Finally, this study seeks to support decision-makers looking to understand the impact of obesity on health in the general population and to identify outcomes that health authorities and public health bodies can use to further mitigate threats and effectively guide obese people globally.
Affiliation(s)
- Mahmood Safaei
  - Centre for Software Technology and Management, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia (UKM), Bangi, 43600, Selangor, Malaysia
- Elankovan A Sundararajan
  - Centre for Software Technology and Management, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia (UKM), Bangi, 43600, Selangor, Malaysia
- Maha Driss
  - RIADI Laboratory, University of Manouba, Manouba, Tunisia; College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia
- Wadii Boulila
  - RIADI Laboratory, University of Manouba, Manouba, Tunisia; College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia
- Azrulhizam Shapi'i
  - Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia (UKM), Bangi, 43600, Selangor, Malaysia
12
Villena F, Pérez J, Lagos R, Dunstan J. Supporting the classification of patients in public hospitals in Chile by designing, deploying and validating a system based on natural language processing. BMC Med Inform Decis Mak 2021; 21:208. [PMID: 34210317 PMCID: PMC8252255 DOI: 10.1186/s12911-021-01565-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/09/2020] [Accepted: 06/23/2021] [Indexed: 11/22/2022] Open
Abstract
Background In Chile, a patient needing a specialty consultation or surgery has to first be referred by a general practitioner, then placed on a waiting list. The Explicit Health Guarantees (GES in Spanish) ensures, by law, the maximum time to solve 85 health problems. Usually, a health professional manually verifies if each referral, written in natural language, corresponds or not to a GES-covered disease. An error in this classification is catastrophic for patients, as it puts them on a non-prioritized waiting list, characterized by prolonged waiting times. Methods To support the manual process, we developed and deployed a system that automatically classifies referrals as GES-covered or not using historical data. Our system is based on word embeddings specially trained for clinical text produced in Chile. We used a vector representation of the reason for referral and patient's age as features for training machine learning models using human-labeled historical data. We constructed a ground truth dataset combining classifications made by three healthcare experts, which was used to validate our results. Results The best performing model over ground truth reached an AUC score of 0.94, with a weighted F1-score of 0.85 (0.87 in precision and 0.86 in recall). During seven months of continuous and voluntary use, the system has amended 87 patient misclassifications. Conclusion This system is a result of a collaboration between technical and clinical experts, and the design of the classifier was custom-tailored for a hospital's clinical workflow, which encouraged the voluntary use of the platform. Our solution can be easily expanded across other hospitals since the registry is uniform in Chile.
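The reported weighted F1-score (0.85) sits slightly below both the precision (0.87) and the recall (0.86). That is possible under the common support-weighted definition, where F1 is computed per class and then averaged, rather than taken as the harmonic mean of the averaged precision and recall. The sketch below illustrates that definition; whether the authors used exactly this averaging is an assumption, and the labels and function name are hypothetical.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return support-weighted (precision, recall, F1) over all true classes."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / n
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = n / total  # each class counts in proportion to its support
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


# Hypothetical referrals labelled as GES-covered ("ges") or not ("non").
y_true = ["ges", "ges", "ges", "non", "non"]
y_pred = ["ges", "ges", "non", "non", "ges"]
precision, recall, f1 = weighted_scores(y_true, y_pred)
```

In this small example the three weighted scores happen to coincide; with imbalanced per-class precision and recall they generally differ, and the weighted F1 need not lie between the other two.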
Affiliation(s)
- Fabián Villena
  - Center for Mathematical Modeling - CNRS UMI2807, Faculty of Physical and Mathematical Sciences, University of Chile, Santiago, Chile
  - Center for Medical Informatics and Telemedicine, ICBM, Faculty of Medicine, University of Chile, Santiago, Chile
- Jorge Pérez
  - Computer Science Department, Faculty of Physical and Mathematical Sciences, University of Chile, Santiago, Chile
  - Millennium Institute for Foundational Research on Data, Santiago, Chile
- René Lagos
  - Digital Health Unit, South East Metropolitan Health Service, Santiago, Chile
- Jocelyn Dunstan
  - Center for Mathematical Modeling - CNRS UMI2807, Faculty of Physical and Mathematical Sciences, University of Chile, Santiago, Chile
  - Center for Medical Informatics and Telemedicine, ICBM, Faculty of Medicine, University of Chile, Santiago, Chile
13
Delnevo G, Mancini G, Roccetti M, Salomoni P, Trombini E, Andrei F. The Prediction of Body Mass Index from Negative Affectivity through Machine Learning: A Confirmatory Study. Sensors (Basel) 2021; 21:2361. [PMID: 33805257 PMCID: PMC8037317 DOI: 10.3390/s21072361] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/17/2021] [Revised: 03/17/2021] [Accepted: 03/26/2021] [Indexed: 11/16/2022]
Abstract
This study investigates the relationship between affect-related psychological variables and Body Mass Index (BMI). We used a novel method based on machine learning (ML) algorithms that predict unobserved BMI values from psychological variables, such as depression. We applied various machine learning algorithms, including gradient boosting and random forest, to psychological variables from 221 subjects to predict both the BMI values and the BMI status (normal, overweight, and obese) of those subjects. We found that these psychological variables allow one to predict both the BMI values (with a mean absolute error of 5.27-5.50) and the BMI status with an accuracy of over 80% (metric: F1-score). Our study also confirmed that negative psychological variables, such as depression, are particularly effective compared to positive ones for predicting BMI values.
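The two prediction targets in this abstract, BMI value and BMI status, and the mean absolute error used to score the former can be sketched as follows. The cut-offs are the standard WHO adult thresholds (overweight from 25 kg/m², obese from 30 kg/m²); that the study used exactly these thresholds is an assumption, and the function names and sample values are hypothetical.

```python
def bmi_status(bmi):
    """Map a BMI value (kg/m^2) to one of the three statuses named in the abstract.

    Uses WHO adult cut-offs (assumed). The abstract lists no underweight
    class, so every value below 25 is labelled "normal" here.
    """
    if bmi >= 30.0:
        return "obese"
    if bmi >= 25.0:
        return "overweight"
    return "normal"


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed and predicted BMI values for three subjects.
observed = [22.0, 27.5, 31.0]
predicted = [24.0, 26.5, 35.0]

mae = mean_absolute_error(observed, predicted)
statuses = [bmi_status(b) for b in predicted]
```

A mean absolute error of roughly 5 BMI units, as reported above, is as wide as the overweight band itself (25 to 30), which is one reason the status classification is evaluated separately from the value prediction.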
Affiliation(s)
- Giovanni Delnevo
  - Department of Computer Science and Engineering, University of Bologna, 40127 Bologna, Italy; (G.D.); (P.S.)
- Giacomo Mancini
  - Department of Education, University of Bologna, 40127 Bologna, Italy
- Marco Roccetti
  - Department of Computer Science and Engineering, University of Bologna, 40127 Bologna, Italy; (G.D.); (P.S.)
- Paola Salomoni
  - Department of Computer Science and Engineering, University of Bologna, 40127 Bologna, Italy; (G.D.); (P.S.)
- Elena Trombini
  - Department of Psychology, University of Bologna, 40127 Bologna, Italy
- Federica Andrei
  - Department of Psychology, University of Bologna, 40127 Bologna, Italy
14
Identification of Risk Factors Associated with Obesity and Overweight-A Machine Learning Overview. Sensors 2020; 20:s20092734. [PMID: 32403349 PMCID: PMC7248873 DOI: 10.3390/s20092734] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Received: 03/18/2020] [Revised: 05/04/2020] [Accepted: 05/07/2020] [Indexed: 02/06/2023]
Abstract
Social determinants such as the adverse influence of globalization, supermarket growth, fast unplanned urbanization, sedentary lifestyle, economy, and social position slowly give rise to behavioral risk factors in humans. Behavioral risk factors such as unhealthy habits, improper diet, and physical inactivity lead to physiological risks, and obesity/overweight is one of the consequences. Obesity and overweight are among the major lifestyle diseases that lead to other health conditions, such as cardiovascular diseases (CVDs), chronic obstructive pulmonary disease (COPD), cancer, type II diabetes, hypertension, and depression. They are not restricted to any age group or socio-economic background. The World Health Organization (WHO) has anticipated that 30% of global deaths will be caused by lifestyle diseases by 2030, and that this can be prevented with the appropriate identification of associated risk factors and behavioral intervention plans. Health behavior change should be given priority to avoid life-threatening damage. The primary purpose of this study is not to present a risk prediction model but to provide a review of various machine learning (ML) methods and their execution using available sample health data in a public repository related to lifestyle diseases, such as obesity, CVDs, and type II diabetes. In this study, we targeted people, both male and female, aged over 20 and under 60, excluding pregnancy and genetic factors. This paper qualifies as a tutorial article on how to use different ML methods to identify potential risk factors of obesity/overweight. Although institutions such as the Centers for Disease Control and Prevention (CDC) and the National Institute for Clinical Excellence (NICE) issue guidelines aimed at understanding the causes and consequences of overweight/obesity, we aimed to use data science to assess the correlated risk factors of obesity/overweight by analyzing existing datasets available in Kaggle and the University of California, Irvine (UCI) database, and to check, with data-visualization techniques and regression analysis, how the potential risk factors change as the body-energy imbalance changes. Analyzing existing obesity/overweight related data using machine learning algorithms did not produce any brand-new risk factors, but it helped us to understand: (a) how the identified risk factors relate to weight change and how to visualize that relationship; (b) what kind of data (potential monitorable risk factors) should be collected over time to develop our intended eCoach system for the promotion of a healthy lifestyle, targeting obesity and overweight as a future case study; (c) why we used the existing Kaggle and UCI datasets for our preliminary study; and (d) which classification and regression models perform better, according to performance metrics, on a dataset of correspondingly limited volume.
15
Kim C, Costello FJ, Lee KC, Li Y, Li C. Predicting Factors Affecting Adolescent Obesity Using General Bayesian Network and What-If Analysis. Int J Environ Res Public Health 2019; 16:ijerph16234684. [PMID: 31775234 PMCID: PMC6926973 DOI: 10.3390/ijerph16234684] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 10/04/2019] [Revised: 11/21/2019] [Accepted: 11/22/2019] [Indexed: 12/22/2022]
Abstract
With the remarkable improvement in people's socioeconomic living standards around the world, adolescent obesity has become an important public health issue that cannot be ignored. We therefore explored the viability of scenario-based simulations using a data mining approach. In doing so, we wanted to explore the merits of using a General Bayesian Network (GBN) with What-If analysis, and how it can be utilized in other areas of public health. We analyzed data from the 2017 Korean Youth Health Behavior Survey conducted directly by the Korea Centers for Disease Control & Prevention, including 19 attributes and 11,206 individual data points. Our simulations found that setting the amount of pocket money to between $60 and $80, coupled with a low-income background, has a high potential to increase obesity compared with other simulated factors. Additionally, an increase in studying time combined with mediocre academic performance was found to potentially increase pressure on adolescents, which subsequently led to an increased obesity outcome. Lastly, increasing the father's education level while decreasing the mother's education level had a large effect on the potential adolescent obesity level. Although obesity was the chosen case, this paper acts more as a proof of concept for analyzing public health through GBN and What-If analysis. It therefore aims to guide health professionals in expanding their ability to simulate outcomes based on predicted changes in factors concerning future public health issues.
Affiliation(s)
- Cheong Kim
  - SKK Business School, Sungkyunkwan University, Seoul 03063, Korea; (C.K.); (F.J.C.); (Y.L.); (C.L.)
  - Airport Business Analytics, Economics Department, Airport Council International (ACI) World, 800 rue du Square Victoria, Suite 1810, Montreal, QC H4Z 1G8, Canada
- Francis Joseph Costello
  - SKK Business School, Sungkyunkwan University, Seoul 03063, Korea; (C.K.); (F.J.C.); (Y.L.); (C.L.)
- Kun Chang Lee
  - SKK Business School, Sungkyunkwan University, Seoul 03063, Korea; (C.K.); (F.J.C.); (Y.L.); (C.L.)
  - Department of Health Sciences & Technology, Samsung Advanced Institute for Health Sciences & Technology (SAIHST), Sungkyunkwan University, Seoul 06355, Korea
  - Creativity Science Research Institute (CSRI), Sungkyunkwan University, Seoul 03063, Korea
  - Correspondence:
- Yuan Li
  - SKK Business School, Sungkyunkwan University, Seoul 03063, Korea; (C.K.); (F.J.C.); (Y.L.); (C.L.)
- Chenyao Li
  - SKK Business School, Sungkyunkwan University, Seoul 03063, Korea; (C.K.); (F.J.C.); (Y.L.); (C.L.)