1
Narayanan A, Stewart T, Duncan S, Pacheco G. Using machine learning to explore the efficacy of administrative variables in prediction of subjective-wellbeing outcomes in New Zealand. Sci Rep 2025; 15:6831. [PMID: 40000735 PMCID: PMC11861262 DOI: 10.1038/s41598-025-90852-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Accepted: 02/17/2025] [Indexed: 02/27/2025] Open
Abstract
The growing acknowledgment of population wellbeing as a key indicator of societal prosperity has propelled governments worldwide to devise policies aimed at improving their citizens' overall wellbeing. In New Zealand, the General Social Survey provides wellbeing metrics for a representative subset of the population (~10,000 individuals). However, this sample size only provides a surface-level understanding of the country's wellbeing landscape, limiting our ability to comprehensively assess the impacts of governmental policies, particularly on smaller subgroups who may be of high policy interest. To overcome this challenge, comprehensive population-level wellbeing data is imperative. Leveraging New Zealand's Integrated Data Infrastructure, this study developed and validated the efficacy of three predictive models (Stepwise Linear Regression, Elastic Net Regression, and Random Forest) for predicting subjective wellbeing outcomes (life satisfaction, life worthwhileness, family wellbeing, and mental wellbeing) using census-level administrative variables as predictors. Our results demonstrated the Random Forest model's effectiveness in predicting subjective wellbeing, reflected in low RMSE values (~1.5). Nonetheless, the models exhibited low R² values, suggesting limited explanatory capacity for the nuanced variability in outcome variables. While achieving reasonable predictive accuracy, our findings underscore the necessity for further model refinements to enhance the prediction of subjective wellbeing outcomes.
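A low RMSE alongside a low R² can look contradictory. The minimal sketch below, which uses made-up scores on an assumed 0-10 scale (not data from the study), illustrates how the two can coexist when outcomes cluster tightly: a model that always predicts the sample mean has a small RMSE yet explains none of the variance. The function names `rmse` and `r_squared` are illustrative helpers, not the authors' code.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the outcome."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical scores clustered near 8 (illustrative only, not study data).
y_true = [8, 7, 9, 8, 6, 8, 9, 7, 8, 10]

# A trivial "model" that always predicts the sample mean.
mean_pred = [sum(y_true) / len(y_true)] * len(y_true)

baseline_rmse = rmse(y_true, mean_pred)
baseline_r2 = r_squared(y_true, mean_pred)
print(f"RMSE = {baseline_rmse:.2f}, R^2 = {baseline_r2:.2f}")
# prints "RMSE = 1.10, R^2 = 0.00"
```

Because RMSE is bounded by the spread of the outcome, it is usually read against a baseline such as this mean predictor rather than on its own.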
Affiliation(s)
- Anantha Narayanan
  - School of Sport and Recreation, Auckland University of Technology, Private Bag 92006, Auckland, 1142, New Zealand
- Tom Stewart
  - School of Sport and Recreation, Auckland University of Technology, Private Bag 92006, Auckland, 1142, New Zealand
- Scott Duncan
  - School of Sport and Recreation, Auckland University of Technology, Private Bag 92006, Auckland, 1142, New Zealand
- Gail Pacheco
  - Faculty of Business, Economics and Law, Auckland University of Technology, Auckland, New Zealand
2
Rozier M, Scroggins S, Loux T, Shacham E. Personal Location as Health-Related Data: Public Knowledge, Public Concern, and Personal Action. Value in Health: The Journal of the International Society for Pharmacoeconomics and Outcomes Research 2023; 26:1314-1320. [PMID: 37236397 DOI: 10.1016/j.jval.2023.05.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2023] [Revised: 04/13/2023] [Accepted: 05/11/2023] [Indexed: 05/28/2023]
Abstract
OBJECTIVES Personal health information (PHI), including health status and behaviors, is often associated with personal locations. Smart devices and other technologies routinely collect personal location. Therefore, technologies collecting personal location do not just create generic questions of privacy, but specific concerns related to PHI. METHODS To assess public opinion on the relationship between health, personal location, and privacy, a national survey of US residents was administered online in March 2020. Respondents answered questions about their use of smart devices and knowledge of location tracking. They also identified which of the locations they could visit were most private and how to balance possibilities that locations may be private but can also be useful to share. RESULTS Of respondents who used smart devices (n = 688), a majority (71.1%) indicated they knew they had applications tracking their location, with respondents who were younger (P < .001) and male (P = .002) and with more education (P = .045) more likely to indicate "yes." When all respondents (N = 828) identified the locations on a hypothetical map they felt were most private, health-related locations (substance use treatment center, hospital, urgent care) were the most selected. CONCLUSIONS The historical notion of PHI is no longer adequate and the public need greater education on how data from smart devices may be used to predict health status and behaviors. The COVID-19 pandemic brought increased attention to personal location as a tool for public health. Given healthcare's dependence upon trust, the field needs to lead the conversation and be viewed as protecting privacy while usefully leveraging location data.
Affiliation(s)
- Michael Rozier
  - Department of Health Management and Policy, Saint Louis University, St. Louis, MO, USA
- Steve Scroggins
  - Department of Health Behavior and Health Education, Saint Louis University, St. Louis, MO, USA; Taylor Geospatial Institute, Saint Louis University, St. Louis, MO, USA
- Travis Loux
  - Department of Epidemiology and Biostatistics, Saint Louis University, St. Louis, MO, USA
- Enbal Shacham
  - Department of Health Behavior and Health Education, Saint Louis University, St. Louis, MO, USA; Taylor Geospatial Institute, Saint Louis University, St. Louis, MO, USA
3
Leist AK, Klee M, Kim JH, Rehkopf DH, Bordas SPA, Muniz-Terrera G, Wade S. Mapping of machine learning approaches for description, prediction, and causal inference in the social and health sciences. Science Advances 2022; 8:eabk1942. [PMID: 36260666 PMCID: PMC9581488 DOI: 10.1126/sciadv.abk1942] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 06/28/2021] [Accepted: 09/01/2022] [Indexed: 05/20/2023]
Abstract
Machine learning (ML) methodology used in the social and health sciences needs to fit the intended research purposes of description, prediction, or causal inference. This paper provides a comprehensive, systematic meta-mapping of research questions in the social and health sciences to appropriate ML approaches by incorporating the necessary requirements to statistical analysis in these disciplines. We map the established classification into description, prediction, counterfactual prediction, and causal structural learning to common research goals, such as estimating prevalence of adverse social or health outcomes, predicting the risk of an event, and identifying risk factors or causes of adverse outcomes, and explain common ML performance metrics. Such mapping may help to fully exploit the benefits of ML while considering domain-specific aspects relevant to the social and health sciences and hopefully contribute to the acceleration of the uptake of ML applications to advance both basic and applied social and health sciences research.
Affiliation(s)
- Anja K. Leist
  - Department of Social Sciences, Institute for Research on Socio-Economic Inequality (IRSEI), University of Luxembourg, Esch-sur-Alzette, Luxembourg
  - Corresponding author.
- Matthias Klee
  - Department of Social Sciences, Institute for Research on Socio-Economic Inequality (IRSEI), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Jung Hyun Kim
  - Department of Social Sciences, Institute for Research on Socio-Economic Inequality (IRSEI), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- David H. Rehkopf
  - Department of Epidemiology and Population Health, Stanford University, Palo Alto, CA, USA
- Graciela Muniz-Terrera
  - Centre for Dementia Prevention, University of Edinburgh, Edinburgh, UK
  - Ohio University, Athens, OH, USA
- Sara Wade
  - School of Mathematics, University of Edinburgh, Edinburgh, UK
4
Demmler JC, Gosztonyi Á, Du Y, Leinonen M, Ruotsalainen L, Järvi L, Ala-Mantila S. A novel approach of creating sustainable urban planning solutions that optimise the local air quality and environmental equity in Helsinki, Finland: The CouSCOUS study protocol. PLoS One 2021; 16:e0260009. [PMID: 34855792 PMCID: PMC8638916 DOI: 10.1371/journal.pone.0260009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2021] [Accepted: 10/30/2021] [Indexed: 11/26/2022] Open
Abstract
Background Air pollution is one of the major environmental challenges cities worldwide face today. Planning healthy environments for all future populations, whilst considering the ongoing demand for urbanisation and provisions needed to combat climate change, remains a difficult task. Objective To combine artificial intelligence (AI), atmospheric and social sciences to provide urban planning solutions that optimise local air quality by applying novel methods and taking into consideration population structures and traffic flows. Methods We will use high-resolution spatial data and linked electronic population cohort for Helsinki Metropolitan Area (Finland) to model (a) population dynamics and urban inequality related to air pollution; (b) detailed aerosol dynamics, aerosol and gas-phase chemistry together with detailed flow characteristics; (c) high-resolution traffic flow addressing dynamical changes at the city environment, such as accidents, construction work and unexpected congestion. Finally, we will fuse the information resulting from these models into an optimal city planning model balancing air quality, comfort, accessibility and travelling efficiency.
Affiliation(s)
- Joanne C. Demmler
  - Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
  - Helsinki Institute of Sustainability Science (HELSUS), University of Helsinki, Helsinki, Finland
- Ákos Gosztonyi
  - Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
  - Helsinki Institute of Sustainability Science (HELSUS), University of Helsinki, Helsinki, Finland
  - Helsinki Inequality Initiative (INEQ), University of Helsinki, Helsinki, Finland
- Yaxing Du
  - Institute for Atmospheric and Earth System Research (INAR), University of Helsinki, Helsinki, Finland
- Matti Leinonen
  - Department of Computer Science, University of Helsinki, Helsinki, Finland
- Laura Ruotsalainen
  - Helsinki Institute of Sustainability Science (HELSUS), University of Helsinki, Helsinki, Finland
  - Department of Computer Science, University of Helsinki, Helsinki, Finland
- Leena Järvi
  - Helsinki Institute of Sustainability Science (HELSUS), University of Helsinki, Helsinki, Finland
  - Institute for Atmospheric and Earth System Research (INAR), University of Helsinki, Helsinki, Finland
- Sanna Ala-Mantila
  - Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
  - Helsinki Institute of Sustainability Science (HELSUS), University of Helsinki, Helsinki, Finland
5
Sampa MB, Hossain MN, Hoque MR, Islam R, Yokota F, Nishikitani M, Ahmed A. Blood Uric Acid Prediction With Machine Learning: Model Development and Performance Comparison. JMIR Med Inform 2020; 8:e18331. [PMID: 33030442 PMCID: PMC7582147 DOI: 10.2196/18331] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 02/20/2020] [Revised: 07/16/2020] [Accepted: 08/10/2020] [Indexed: 02/06/2023] Open
Abstract
Background Uric acid is associated with noncommunicable diseases such as cardiovascular diseases, chronic kidney disease, coronary artery disease, stroke, diabetes, metabolic syndrome, vascular dementia, and hypertension. Therefore, uric acid is considered to be a risk factor for the development of noncommunicable diseases. Most studies on uric acid have been performed in developed countries, and the application of machine-learning approaches in uric acid prediction in developing countries is rare. Different machine-learning algorithms will work differently on different types of data in various diseases; therefore, a different investigation is needed for different types of data to identify the most accurate algorithms. Specifically, no study has yet focused on the urban corporate population in Bangladesh, despite the high risk of developing noncommunicable diseases for this population. Objective The aim of this study was to develop a model for predicting blood uric acid values based on basic health checkup test results, dietary information, and sociodemographic characteristics using machine-learning algorithms. The prediction of health checkup test measurements can be very helpful to reduce health management costs. Methods Various machine-learning approaches were used in this study because clinical input data are not completely independent and exhibit complex interactions. Conventional statistical models have limitations to consider these complex interactions, whereas machine learning can consider all possible interactions among input data. We used boosted decision tree regression, decision forest regression, Bayesian linear regression, and linear regression to predict personalized blood uric acid based on basic health checkup test results, dietary information, and sociodemographic characteristics. We evaluated the performance of these five widely used machine-learning models using data collected from 271 employees in the Grameen Bank complex of Dhaka, Bangladesh. 
Results The mean uric acid level was 6.63 mg/dL, indicating a borderline result for the majority of the sample (normal range <7.0 mg/dL). Therefore, these individuals should be monitoring their uric acid regularly. The boosted decision tree regression model showed the best performance among the models tested based on the root mean squared error of 0.03, which is also better than that of any previously reported model. Conclusions A uric acid prediction model was developed based on personal characteristics, dietary information, and some basic health checkup measurements. This model will be useful for improving awareness among high-risk individuals and populations, which can help to save medical costs. A future study could include additional features (eg, work stress, daily physical activity, alcohol intake, eating red meat) in improving prediction.
Affiliation(s)
- Masuda Begum Sampa
  - Department of Advanced Information Technology, Kyushu University, Fukuoka, Japan
- Md Nazmul Hossain
  - Department of Marketing, Faculty of Business Studies, University of Dhaka, Dhaka, Bangladesh
- Md Rakibul Hoque
  - School of Business, Emporia State University, Kansas, KS, United States
- Rafiqul Islam
  - Medical Information Center, Kyushu University Hospital, Fukuoka, Japan
- Fumihiko Yokota
  - Institute of Decision Science for a Sustainable Society, Kyushu University, Fukuoka, Japan
- Ashir Ahmed
  - Department of Advanced Information Technology, Kyushu University, Fukuoka, Japan
6
Abstract
Infectious diseases are caused by microorganisms belonging to the class of bacteria, viruses, fungi, or parasites. These pathogens are transmitted, directly or indirectly, and can lead to epidemics or even pandemics. The resulting infection may lead to mild-to-severe symptoms such as life-threatening fever or diarrhea. Infectious diseases may be asymptomatic in some individuals but may lead to disastrous effects in others. Despite the advances in medicine, infectious diseases are a leading cause of death worldwide, especially in low-income countries. With the advent of mathematical tools, scientists are now able to better predict epidemics, understand the specificity of each pathogen, and identify potential targets for drug development. Artificial intelligence and its components have been widely publicized for their ability to better diagnose certain types of cancer from imaging data. This chapter aims at identifying potential applications of machine learning in the field of infectious diseases. We are deliberately focusing on key aspects of infection: diagnosis, transmission, response to treatment, and resistance. We are proposing the use of extreme values as an avenue of interest for future developments in the field of infectious diseases. This chapter covers a series of applications selectively chosen to showcase how artificial intelligence is moving the field of infectious disease further and how it helps institutions to better tackle them, especially in low-income countries.
Affiliation(s)
- Said Agrebi
  - Yobitrust, Technopark El Gazala, Ariana, Tunisia
- Anis Larbi
  - Singapore Immunology Network, Agency for Science, Technology and Research, Singapore, Singapore; Department of Microbiology & Immunology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
7
Dunstan J, Aguirre M, Bastías M, Nau C, Glass TA, Tobar F. Predicting nationwide obesity from food sales using machine learning. Health Informatics J 2019; 26:652-663. [PMID: 31106648 DOI: 10.1177/1460458219845959] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 12/16/2022]
Abstract
The obesity epidemic progresses everywhere across the globe, and implementing frequent nationwide surveys to measure the percentage of obese population is costly. Conversely, country-level food sales information can be accessed inexpensively through different suppliers on a regular basis. This study applies a methodology to predict obesity prevalence at the country-level based on national sales of a small subset of food and beverage categories. Three machine learning algorithms for nonlinear regression were implemented using purchase and obesity prevalence data from 79 countries: support vector machines, random forests and extreme gradient boosting. The proposed method was validated in terms of both the absolute prediction error and the proportion of countries for which the obesity prevalence was predicted satisfactorily. We found that the most-relevant food category to predict obesity is baked goods and flours, followed by cheese and carbonated drinks.
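The two validation criteria named above (absolute prediction error, and the share of countries predicted satisfactorily) can be sketched as follows. The abstract does not state what counts as "satisfactory", so the `tolerance` threshold, the helper names, and the prevalence figures below are all hypothetical and chosen only for illustration.

```python
def absolute_errors(observed, predicted):
    """Per-country absolute prediction error, in percentage points."""
    return [abs(o - p) for o, p in zip(observed, predicted)]

def proportion_within(observed, predicted, tolerance):
    """Share of countries whose absolute error is at most `tolerance`."""
    errors = absolute_errors(observed, predicted)
    return sum(e <= tolerance for e in errors) / len(errors)

# Made-up obesity prevalence values (%), one per hypothetical country.
observed = [22.0, 30.5, 12.0, 27.0, 18.5]
predicted = [24.0, 29.0, 17.5, 26.0, 18.0]

errors = absolute_errors(observed, predicted)
mean_abs_error = sum(errors) / len(errors)

# Assumed threshold: a prediction is "satisfactory" within 2.5 points.
share_ok = proportion_within(observed, predicted, tolerance=2.5)

print(f"MAE = {mean_abs_error:.2f} points, satisfactory share = {share_ok:.0%}")
# prints "MAE = 2.10 points, satisfactory share = 80%"
```

Reporting both numbers is useful because a single badly predicted country can dominate the mean error while leaving the satisfactory share almost unchanged.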
8
Improving age measurement in low- and middle-income countries through computer vision: A test in Senegal. Demographic Research 2019. [DOI: 10.4054/demres.2019.40.9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022] Open