1
Lee SJ, Ju JT, Lee JJ, Song CK, Shin SA, Jung HJ, Shin HJ, Choi SD. Mapping nationwide concentrations of sulfate and nitrate in ambient PM 2.5 in South Korea using machine learning with ground observation data. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 926:171884. [PMID: 38527532 DOI: 10.1016/j.scitotenv.2024.171884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 02/24/2024] [Accepted: 03/20/2024] [Indexed: 03/27/2024]
Abstract
Particulate matter (PM) is a major air pollutant in Northeast Asia, with frequent high PM episodes. To map the nationwide spatial distribution of PM2.5 and secondary inorganic aerosols in South Korea, prediction models for SO₄²⁻ and NO₃⁻ concentrations in PM2.5 were developed using machine learning with ground-based observation data. Specifically, the random forest algorithm was used to predict the SO₄²⁻ and NO₃⁻ concentrations at 548 air quality monitoring stations located within the representative radii of eight intensive air quality monitoring stations. The average concentrations of PM2.5, SO₄²⁻, and NO₃⁻ across the entire nation were 17.2 ± 2.8, 3.0 ± 0.6, and 3.4 ± 1.2 μg/m3, respectively. The spatial distributions of SO₄²⁻ and NO₃⁻ concentrations in 2021 revealed elevated concentrations in both the western and central regions of South Korea. This result suggests that SO₄²⁻ concentrations were primarily influenced by industrial activities rather than vehicle emissions, whereas NO₃⁻ concentrations were more associated with vehicle emissions. During a high PM2.5 event (November 19-21, 2021), the concentration of SO₄²⁻ was primarily influenced by SOx emissions from China, while the concentration of NO₃⁻ was affected by NOx emissions from both China and Korea. The methodology developed in this study can be used to explore the chemical characteristics of PM2.5 with high spatiotemporal resolution. It can also provide valuable insights for the nationwide mitigation of secondary PM2.5 pollution.
Affiliation(s)
- Sang-Jin Lee
  - Department of Civil, Urban, Earth, and Environmental Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
- Jeong-Tae Ju
  - Department of Civil, Urban, Earth, and Environmental Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
- Jong-Jae Lee
  - Research and Management Center for Particulate Matter in the Southeast Region of Korea, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
- Chang-Keun Song
  - Department of Civil, Urban, Earth, and Environmental Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea; Research and Management Center for Particulate Matter in the Southeast Region of Korea, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea; Graduate School of Carbon Neutrality, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
- Sun-A Shin
  - Climate and Air Quality Research Department, National Institute of Environmental Research, Incheon, 22689, Republic of Korea
- Hae-Jin Jung
  - Climate and Air Quality Research Department, National Institute of Environmental Research, Incheon, 22689, Republic of Korea
- Hye Jung Shin
  - Climate and Air Quality Research Department, National Institute of Environmental Research, Incheon, 22689, Republic of Korea
- Sung-Deuk Choi
  - Department of Civil, Urban, Earth, and Environmental Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea; Research and Management Center for Particulate Matter in the Southeast Region of Korea, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
2
Lyu Y, Gao Y, Pang X, Sun S, Luo P, Cai D, Qin K, Wu Z, Wang B. Elucidating contributions of volatile organic compounds to ozone formation using random forest during COVID-19 pandemic: A case study in China. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2024; 346:123532. [PMID: 38365075 DOI: 10.1016/j.envpol.2024.123532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 11/10/2023] [Accepted: 02/07/2024] [Indexed: 02/18/2024]
Abstract
Ozone has been reported to increase despite reductions in nitrogen oxides during the COVID-19 pandemic, and ozone formation needs to be revisited using volatile organic compounds (VOCs), which were rarely measured during the pandemic. Here, a total of 98 VOC species were monitored in an economically active city in China from January 2021 to August 2022 to assess their contributions to ozone formation during the pandemic. Total VOC concentrations were 35.55 ± 21.47 ppb over the entire period, of which alkanes accounted for the largest fraction (13.78 ppb, 38.0%), followed by aromatics (6.16 ppb, 16.8%) and oxygenated VOCs (OVOCs, 5.69 ppb, 15.7%). Most VOC groups (e.g., alkenes, OVOCs) and individual species (e.g., isoprene, methyl vinyl ketone) display obvious seasonal and diurnal variations, which are related to their sources and reactivities. The absence of a weekend effect in VOCs suggests limited influence from traffic emissions during the pandemic. Aromatics and alkenes are the major contributors (39% and 33%) to ozone formation potential, largely driven by o/m/p-xylene (21%), ethylene (15%), and toluene (9%). Secondary organic aerosol formation potential is dominated by toluene (>50%) despite its low proportion (5%). Including VOCs and meteorology in the Random Forest model yields good ozone prediction performance (R2 = 0.77-0.86, RMSE = 11.95-19.91 μg/m3, MAE = 8.89-14.58 μg/m3). VOCs and NO2 contribute >50% of total importance, with the largest difference in the VOCs/NO2 importance ratio between summer and winter, implying that the ozone formation regime may vary. No seasonal variations in the importance of meteorology are observed, while the importance of other variables (e.g., PM2.5) is highest in the summer. This work identifies critical VOC groups and species for ozone formation during the pandemic, and demonstrates the feasibility of machine learning algorithms in elucidating ozone formation mechanisms.
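The three scores quoted in this abstract (R2, RMSE, MAE) can be computed from paired observations and predictions. The following is a minimal sketch, assuming R2 means the coefficient of determination (some studies report the squared Pearson correlation instead, and the abstract does not say which). The function name `regression_metrics` is hypothetical and not taken from the paper.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R2, RMSE, MAE) for paired observed and predicted values.

    R2 is computed as the coefficient of determination, 1 - SS_res / SS_tot.
    This is an assumption; the abstract does not define its R2.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)    # total sum of squares
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)                           # same units as the data
    mae = sum(abs(e) for e in errors) / n                  # same units as the data
    return r2, rmse, mae
```

RMSE and MAE carry the units of the target (here μg/m3), while R2 is unitless, which is why the abstract reports ranges for all three.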
Affiliation(s)
- Yan Lyu
  - College of Environment, Zhejiang University of Technology, Hangzhou, 310014, China; School of Environment and Spatial Informatics, China University of Mining and Technology, Xuzhou, 221116, China; Shaoxing Research Institute, Zhejiang University of Technology, Shaoxing, 312077, China
- Yibu Gao
  - College of Environment, Zhejiang University of Technology, Hangzhou, 310014, China
- Xiaobing Pang
  - College of Environment, Zhejiang University of Technology, Hangzhou, 310014, China; Shaoxing Research Institute, Zhejiang University of Technology, Shaoxing, 312077, China
- Songhua Sun
  - Shaoxing Ecological and Environmental Monitoring Center of Zhejiang Province, Shaoxing, 312000, China
- Peisong Luo
  - Shaoxing Ecological and Environmental Monitoring Center of Zhejiang Province, Shaoxing, 312000, China
- Dongmei Cai
  - Department of Environment Sciences and Engineering, Fudan University, Shanghai, 200433, China
- Kai Qin
  - School of Environment and Spatial Informatics, China University of Mining and Technology, Xuzhou, 221116, China
- Zhentao Wu
  - College of Environment, Zhejiang University of Technology, Hangzhou, 310014, China
- Baozhen Wang
  - Green Intelligence Environmental School, Yangtze Normal University, Chongqing, 408100, China
3
Du X, Yuan Z, Huang D, Ma W, Yang J, Mo J. Importance of secondary decomposition in the accurate prediction of daily-scale ozone pollution by machine learning. THE SCIENCE OF THE TOTAL ENVIRONMENT 2023; 904:166963. [PMID: 37696411 DOI: 10.1016/j.scitotenv.2023.166963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 08/17/2023] [Accepted: 09/08/2023] [Indexed: 09/13/2023]
Abstract
Machine learning (ML) models have proven to be a reliable tool for predicting ambient pollution concentrations at various places in the world. However, their performance in predicting the maximum daily 8-h averaged ozone (MDA8 O3), the metric often used for O3 pollution assessment and management, is relatively poor. This largely results from the more irregular fluctuations of MDA8 O3 levels, which are governed collectively by the synoptic condition, local photochemistry, and long-range transport. To improve the prediction accuracy of MDA8 O3, this study developed a secondary decomposition ML model framework that couples the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) as the primary decomposition, the variational mode decomposition (VMD) as the secondary decomposition, and the gated recurrent unit (GRU) ML model. By applying this secondary decomposition framework to MDA8 O3 prediction for the first time, we showed that the prediction accuracy of MDA8 O3 over the Pearl River Delta (PRD) of China improved substantially, from an R2 of 0.46 and RMSE of 30.4 μg/m3 for GRU without decomposition to an R2 of 0.91 and RMSE of 12.6 μg/m3. We also found that the prediction accuracy rate for O3 pollution non-attainments, an essential indicator for initiating contingency O3 pollution control, improved greatly from 14.9 % for GRU without decomposition to 72.5 %. The performance of O3 non-attainment prediction is relatively higher in the southwestern PRD, mainly because of the greater number and severity of O3 non-attainments in southwestern cities located downwind of the emission hotspot area in the central PRD. This study underscores the importance of secondary decomposition in accurately predicting daily-scale O3 concentrations and non-attainments over the PRD; the approach can be extended to other photochemically active regions worldwide to improve their O3 prediction accuracy and assist in O3 contingency control.
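The target metric here, MDA8 O3, is the largest 8-hour running mean of ozone within a day. The sketch below shows that calculation for one day of 24 hourly values. It assumes only windows that lie fully inside the day are counted and that no hours are missing; regulatory definitions differ on both points, and the abstract does not state which convention the study used. The function name `mda8` is hypothetical.

```python
def mda8(hourly):
    """Maximum daily 8-hour average from 24 hourly concentrations.

    Assumption: only the 17 windows that start at hours 0..16 and fit
    entirely within the day are considered, and all 24 values are present.
    """
    if len(hourly) != 24:
        raise ValueError("expected 24 hourly values")
    window = 8
    # One running mean per admissible start hour.
    means = [
        sum(hourly[start:start + window]) / window
        for start in range(24 - window + 1)
    ]
    return max(means)
```

Because the daily maximum of a running mean reacts sharply to a few afternoon hours, the resulting series fluctuates more irregularly than a daily mean, which is consistent with the difficulty described in the abstract.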
Affiliation(s)
- Xinyue Du
  - School of Environment and Energy, South China University of Technology, Guangzhou 510006, China
- Zibing Yuan
  - School of Environment and Energy, South China University of Technology, Guangzhou 510006, China
- Daojian Huang
  - South China Institute of Environmental Sciences, Ministry of Ecology and Environment of China, Guangzhou 510655, China
- Wei Ma
  - School of Environment and Energy, South China University of Technology, Guangzhou 510006, China
- Jun Yang
  - School of Environment and Energy, South China University of Technology, Guangzhou 510006, China
- Jianbin Mo
  - School of Environment and Energy, South China University of Technology, Guangzhou 510006, China
4
Wei J, Wang J, Li Z, Kondragunta S, Anenberg S, Wang Y, Zhang H, Diner D, Hand J, Lyapustin A, Kahn R, Colarco P, da Silva A, Ichoku C. Long-term mortality burden trends attributed to black carbon and PM 2·5 from wildfire emissions across the continental USA from 2000 to 2020: a deep learning modelling study. Lancet Planet Health 2023; 7:e963-e975. [PMID: 38056967 DOI: 10.1016/s2542-5196(23)00235-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/17/2023] [Revised: 10/04/2023] [Accepted: 10/12/2023] [Indexed: 12/08/2023]
Abstract
BACKGROUND Long-term improvements in air quality and public health in the continental USA were disrupted over the past decade by increased fire emissions that potentially offset the decrease in anthropogenic emissions. This study aims to estimate trends in black carbon and PM2·5 concentrations and their attributable mortality burden across the USA. METHODS In this study, we derived daily concentrations of PM2·5 and its highly toxic black carbon component at a 1-km resolution in the USA from 2000 to 2020 via deep learning that integrated big data from satellites, models, and surface observations. We estimated the annual PM2·5-attributable and black carbon-attributable mortality burden at each 1-km2 grid using concentration-response functions collected from a national cohort study and a meta-analysis study, respectively. We investigated the spatiotemporal linear-regressed trends in PM2·5 and black carbon pollution and their associated premature deaths from 2000 to 2020, and the impact of wildfires on air quality and public health. FINDINGS Our results showed that PM2·5 and black carbon estimates are reliable, with sample-based cross-validated coefficients of determination of 0·82 and 0·80, respectively, for daily estimates (0·97 and 0·95 for monthly estimates). Both PM2·5 and black carbon in the USA showed significantly decreasing trends overall during 2000 to 2020 (22% decrease for PM2·5 and 11% decrease for black carbon), leading to a reduction of around 4200 premature deaths per year (95% CI 2960-5050). However, since 2010, the decreasing trends of fine particles and premature deaths have reversed to increase in the western USA (55% increase in PM2·5, 86% increase in black carbon, and increase of 670 premature deaths [460-810]), while remaining mostly unchanged in the eastern USA. The western USA showed large interannual fluctuations that were attributable to the increasing incidence of wildfires. 
Furthermore, the black carbon-to-PM2·5 mass ratio increased annually by 2·4% across the USA, mainly due to increasing wildfire emissions in the western USA and more rapid reductions of other components in the eastern USA, suggesting a potential increase in the relative toxicity of PM2·5. 100% of populated areas in the USA have experienced at least one day of PM2·5 pollution exceeding the daily air quality guideline level of 15 μg/m3 during 2000-2020, with 99% experiencing at least 7 days and 85% experiencing at least 30 days. The recent widespread wildfires have greatly increased the daily exposure risks in the western USA, and have also impacted the midwestern USA due to the long-range transport of smoke. INTERPRETATION Wildfires have become increasingly intensive and frequent in the western USA, resulting in a significant increase in smoke-related emissions in populated areas. This increase is likely to have contributed to a decline in air quality and an increase in attributable mortality. Reducing fire risk via effective policies besides mitigation of climate warming, such as wildfire prevention and management, forest restoration, and new revenue generation, could substantially improve air quality and public health in the coming decades. FUNDING National Aeronautics and Space Administration (NASA) Applied Science programme, NASA MODIS maintenance programme, NASA MAIA satellite mission programme, NASA GMAO core fund, National Oceanic and Atmospheric Administration (NOAA) GEO-XO project, NOAA Atmospheric Chemistry, Carbon Cycle, and Climate (AC4) programme, and NOAA Educational Partnership Program with Minority Serving Institutions.
Affiliation(s)
- Jing Wei
  - Department of Chemical and Biochemical Engineering, Iowa Technology Institute, Center for Global and Regional Environmental Research, University of Iowa, Iowa City, IA, USA; Department of Atmospheric and Oceanic Science, Earth System Science Interdisciplinary Center, University of Maryland, College Park, MD, USA
- Jun Wang
  - Department of Chemical and Biochemical Engineering, Iowa Technology Institute, Center for Global and Regional Environmental Research, University of Iowa, Iowa City, IA, USA
- Zhanqing Li
  - Department of Atmospheric and Oceanic Science, Earth System Science Interdisciplinary Center, University of Maryland, College Park, MD, USA
- Shobha Kondragunta
  - Center for Satellite Applications and Research, NOAA National Environmental Satellite, Data, and Information Service, College Park, MD, USA
- Susan Anenberg
  - Department of Environmental and Occupational Health, George Washington University, Washington, DC, USA
- Yi Wang
  - Department of Chemical and Biochemical Engineering, Iowa Technology Institute, Center for Global and Regional Environmental Research, University of Iowa, Iowa City, IA, USA
- Huanxin Zhang
  - Department of Chemical and Biochemical Engineering, Iowa Technology Institute, Center for Global and Regional Environmental Research, University of Iowa, Iowa City, IA, USA
- David Diner
  - Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA
- Jenny Hand
  - Cooperative Institute for Research in the Atmosphere, Colorado State University, Fort Collins, CO, USA
- Alexei Lyapustin
  - Climate and Radiation Laboratory, NASA Goddard Space Flight Center, Greenbelt, MD, USA
- Ralph Kahn
  - Climate and Radiation Laboratory, NASA Goddard Space Flight Center, Greenbelt, MD, USA
- Peter Colarco
  - Atmospheric Chemistry and Dynamics Laboratory, NASA Goddard Space Flight Center, Greenbelt, MD, USA
- Arlindo da Silva
  - Global Modeling and Assimilation Office, NASA Goddard Space Flight Center, Greenbelt, MD, USA
- Charles Ichoku
  - Department of Geography and Environmental Systems, University of Maryland Baltimore County, Baltimore, MD, USA
5
Reid CE, Considine EM, Watson GL, Telesca D, Pfister GG, Jerrett M. Effect modification of the association between fine particulate air pollution during a wildfire event and respiratory health by area-level measures of socio-economic status, race/ethnicity, and smoking prevalence. ENVIRONMENTAL RESEARCH, HEALTH : ERH 2023; 1:025005. [PMID: 38332844 PMCID: PMC10852067 DOI: 10.1088/2752-5309/acc4e1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/10/2024]
Abstract
Fine particulate air pollution (PM2.5) is decreasing in most areas of the United States, except for areas most affected by wildfires, where increasing trends in PM2.5 can be attributed to wildfire smoke. The frequency and duration of large wildfires and the length of the wildfire season have all increased in recent decades, partially due to climate change, and wildfire risk is projected to increase further in many regions including the western United States. Increasingly, empirical evidence suggests differential health effects from air pollution by class and race; however, few studies have investigated such differential health impacts from air pollution during a wildfire event. We investigated differential risk of respiratory health impacts during the 2008 northern California wildfires by a comprehensive list of socio-economic status (SES), race/ethnicity, and smoking prevalence variables. Regardless of SES level across nine measures of SES, we found significant associations between PM2.5 and asthma hospitalizations and emergency department (ED) visits during these wildfires. Differential respiratory health risk was found by SES for ED visits for chronic obstructive pulmonary disease where the highest risks were in ZIP codes with the lowest SES levels. Findings for differential effects by race/ethnicity were less consistent across health outcomes. We found that ZIP codes with higher prevalence of smokers had greater risk of ED visits for asthma and pneumonia. Our study suggests that public health efforts to decrease exposures to high levels of air pollution during wildfires should focus on lower SES communities.
Affiliation(s)
- C E Reid
  - Department of Geography, University of Colorado Boulder, Boulder, CO, United States of America
- E M Considine
  - Department of Applied Math, University of Colorado Boulder, Boulder, CO, United States of America
  - Current address: Department of Biostatistics, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, United States of America
- G L Watson
  - Department of Biostatistics, Fielding School of Public Health, University of California Los Angeles, Los Angeles, CA, United States of America
- D Telesca
  - Department of Biostatistics, Fielding School of Public Health, University of California Los Angeles, Los Angeles, CA, United States of America
- G G Pfister
  - National Center for Atmospheric Research, Boulder, CO, United States of America
- M Jerrett
  - Department of Environmental Health Sciences, Fielding School of Public Health, University of California Los Angeles, Los Angeles, CA, United States of America
6
Baik SM, Hong KS, Park DJ. Application and utility of boosting machine learning model based on laboratory test in the differential diagnosis of non-COVID-19 pneumonia and COVID-19. Clin Biochem 2023; 118:110584. [PMID: 37211061 PMCID: PMC10197431 DOI: 10.1016/j.clinbiochem.2023.05.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Revised: 05/06/2023] [Accepted: 05/17/2023] [Indexed: 05/23/2023]
Abstract
BACKGROUND Non-coronavirus disease 2019 (COVID-19) pneumonia and COVID-19 have similar clinical features but last for different periods and consequently require different treatment protocols. Therefore, they must be differentially diagnosed. This study uses artificial intelligence (AI) to classify the two forms of pneumonia using mainly laboratory test data. METHODS Various AI models are applied, including boosting models known for deftly solving classification problems. In addition, important features that affect the classification prediction performance are identified using the feature importance technique and the SHapley Additive exPlanations method. Despite the data imbalance, the developed model exhibits robust performance. RESULTS eXtreme gradient boosting, category boosting, and light gradient boosted machine yield an area under the receiver operating characteristic curve of 0.99 or more, accuracy of 0.96-0.97, and F1-score of 0.96-0.97. In addition, D-dimer, eosinophil, glucose, aspartate aminotransferase, and basophil, which are rather nonspecific laboratory test results, are demonstrated to be important features in differentiating the two disease groups. CONCLUSIONS The boosting model, which excels in producing classification models using categorical data, also excels in developing classification models using linear numerical data, such as laboratory tests. Finally, the proposed model can be applied in various fields to solve classification problems.
Affiliation(s)
- Seung Min Baik
  - Division of Critical Care Medicine, Department of Surgery, Ewha Womans University Mokdong Hospital, Ewha Womans University College of Medicine, Seoul, Korea; Department of Surgery, Korea University College of Medicine, Seoul, Korea
- Kyung Sook Hong
  - Division of Critical Care Medicine, Department of Surgery, Ewha Womans University Seoul Hospital, Ewha Womans University College of Medicine, Seoul, Korea
- Dong Jin Park
  - Department of Laboratory Medicine, Eunpyeong St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Korea
7
Baik SM, Hong KS, Park DJ. Deep learning approach for early prediction of COVID-19 mortality using chest X-ray and electronic health records. BMC Bioinformatics 2023; 24:190. [PMID: 37161395 PMCID: PMC10169101 DOI: 10.1186/s12859-023-05321-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Accepted: 05/05/2023] [Indexed: 05/11/2023]
Abstract
BACKGROUND An artificial-intelligence (AI) model for predicting the prognosis or mortality of coronavirus disease 2019 (COVID-19) patients will allow efficient allocation of limited medical resources. We developed an early mortality prediction ensemble model for COVID-19 using AI models with initial chest X-ray and electronic health record (EHR) data. RESULTS We used convolutional neural network (CNN) models (Inception-ResNet-V2 and EfficientNet) for chest X-ray analysis and multilayer perceptron (MLP), Extreme Gradient Boosting (XGBoost), and random forest (RF) models for EHR data analysis. The Gradient-weighted Class Activation Mapping and Shapley Additive Explanations (SHAP) methods were used to determine the effects of these features on COVID-19. We developed an ensemble model (Area under the receiver operating characteristic curve of 0.8698) using a soft voting method with weight differences for CNN, XGBoost, MLP, and RF models. To resolve the data imbalance, we conducted F1-score optimization by adjusting the cutoff values to optimize the model performance (F1 score of 0.77). CONCLUSIONS Our study is meaningful in that we developed an early mortality prediction model using only the initial chest X-ray and EHR data of COVID-19 patients. Early prediction of the clinical courses of patients is helpful for not only treatment but also bed management. Our results confirmed the performance improvement of the ensemble model achieved by combining AI models. Through the SHAP method, laboratory tests that indicate the factors affecting COVID-19 mortality were discovered, highlighting the importance of these tests in managing COVID-19 patients.
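Two steps named in this abstract, weighted soft voting and choosing a cutoff that maximizes the F1 score, can be illustrated in a few lines. This is a sketch under stated assumptions, not the authors' implementation: the weights, candidate cutoffs, and all function names (`soft_vote`, `f1_score`, `best_cutoff`) are hypothetical, and each model is assumed to output a positive-class probability per patient.

```python
def soft_vote(prob_lists, weights):
    """Weighted average of per-model positive-class probabilities.

    prob_lists: one list of probabilities per model, all the same length.
    weights: one non-negative weight per model (illustrative values only).
    """
    total = sum(weights)
    n = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n)
    ]


def f1_score(labels, preds):
    """F1 for binary labels and predictions coded as 0 or 1."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_cutoff(labels, probs, candidates):
    """Return (cutoff, f1) for the candidate cutoff with the highest F1.

    Ties keep the first candidate encountered.
    """
    best = (None, -1.0)
    for cutoff in candidates:
        preds = [1 if p >= cutoff else 0 for p in probs]
        score = f1_score(labels, preds)
        if score > best[1]:
            best = (cutoff, score)
    return best
```

With imbalanced classes the default cutoff of 0.5 often favors the majority class, so scanning cutoffs is a common remedy. The scan is normally run on validation data rather than on the test set, to avoid an optimistic F1 estimate.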
Affiliation(s)
- Seung Min Baik
  - Division of Critical Care Medicine, Department of Surgery, Ewha Womans University Mokdong Hospital, Ewha Womans University College of Medicine, Seoul, Korea
- Kyung Sook Hong
  - Division of Critical Care Medicine, Department of Surgery, Ewha Womans University Seoul Hospital, Ewha Womans University College of Medicine, Seoul, Korea
- Dong Jin Park
  - Department of Laboratory Medicine, Eunpyeong St. Mary's Hospital, College of Medicine, The Catholic University of Korea, 1021, Tongil-ro, Eunpyeong-gu, Seoul, 03312, Korea
8
Wang L, Zhao Y, Shi J, Ma J, Liu X, Han D, Gao H, Huang T. Predicting ozone formation in petrochemical industrialized Lanzhou city by interpretable ensemble machine learning. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2023; 318:120798. [PMID: 36464118 DOI: 10.1016/j.envpol.2022.120798] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/07/2022] [Revised: 11/24/2022] [Accepted: 11/29/2022] [Indexed: 06/17/2023]
Abstract
Ground-level ozone (O3) formation depends on meteorology, precursor emissions, and atmospheric chemistry. Understanding the key drivers behind the O3 formation and developing an accurate and efficient method for timely assessing the O3-VOCs-NOx relationships applicable in different O3 pollution events are essential. Here, we developed a novel machine learning ensemble model coupled with a Shapley additive explanation algorithm to predict the O3 formation regime and derive O3 formation sensitivity curves. The algorithm was tested for O3 events during the COVID-19 lockdown, a sandstorm event, and a heavy O3 pollution episode (maximum hourly O3 concentration >200 μg/m3) from 2019 to 2021. We show that increasing O3 concentrations during the COVID-19 lockdown and the heavy O3 pollution event were mainly caused by the photochemistry subject to local air quality and meteorological conditions. Influenced by the sandstorm weather, low O3 levels were mainly attributable to weak sunlight and low precursor levels. O3 formation sensitivity curves demonstrate that O3 formation in the study area was in a VOCs-sensitive regime. The VOCs-specific O3 sensitivity curves can also help make hybrid and timely strategies for O3 abatement. The results demonstrate that machine learning driven by observational data has the potential to be a very useful tool in predicting and interpreting O3 formation.
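The Shapley additive explanation approach mentioned here attributes a prediction to individual input features. As a rough illustration of the underlying idea, the sketch below computes exact Shapley values for a small model by enumerating feature subsets. It assumes a baseline-replacement value function (features outside the subset are set to a reference value), which is one of several conventions; practical SHAP tools use approximations, and nothing here reproduces the paper's ensemble model. The name `shapley_values` and the toy model in the usage note are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline) to each feature.

    model: callable taking a list of feature values and returning a number.
    x, baseline: lists of equal length. Cost grows as 2**len(x), so this is
    only practical for a handful of features.
    """
    n = len(x)

    def value(subset):
        # Features in `subset` take their actual value; the rest take the baseline.
        mixed = [x[j] if j in subset else baseline[j] for j in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for subsets of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi
```

For a toy model `lambda v: 2 * v[0] + 3 * v[1] + v[0] * v[1]` evaluated at `[1.0, 1.0]` against a zero baseline, the interaction term is split evenly between the two features, and the attributions sum to the difference between the prediction and the baseline prediction.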
Affiliation(s)
- Li Wang
  - Collaborative Innovation Center for Western Ecological Safety, Lanzhou University, Lanzhou, 730000, China
- Yuan Zhao
  - Key Laboratory for Environmental Pollution Prediction and Control, Gansu Province, College of Earth and Environmental Sciences, Lanzhou University, Lanzhou, 730000, China
- Jinsen Shi
  - Collaborative Innovation Center for Western Ecological Safety, Lanzhou University, Lanzhou, 730000, China
- Jianmin Ma
  - Laboratory for Earth Surface Processes, College of Urban and Environmental Sciences, Peking University, Beijing, 100871, China
- Xiaoyue Liu
  - Key Laboratory for Semi-Arid Climate Change of the Ministry of Education, College of Atmospheric Sciences, Lanzhou University, Lanzhou, 730000, China
- Dongliang Han
  - Collaborative Innovation Center for Western Ecological Safety, Lanzhou University, Lanzhou, 730000, China
- Hong Gao
  - Key Laboratory for Environmental Pollution Prediction and Control, Gansu Province, College of Earth and Environmental Sciences, Lanzhou University, Lanzhou, 730000, China
- Tao Huang
  - Key Laboratory for Environmental Pollution Prediction and Control, Gansu Province, College of Earth and Environmental Sciences, Lanzhou University, Lanzhou, 730000, China
9
Sadeghi B, Ghahremanloo M, Mousavinezhad S, Lops Y, Pouyaei A, Choi Y. Contributions of meteorology to ozone variations: Application of deep learning and the Kolmogorov-Zurbenko filter. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2022; 310:119863. [PMID: 35963387 DOI: 10.1016/j.envpol.2022.119863] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/14/2022] [Revised: 07/07/2022] [Accepted: 07/23/2022] [Indexed: 06/15/2023]
Abstract
From hourly ozone observations obtained from three regions (Houston, Dallas, and West Texas), we investigated the contributions of meteorology to changes in surface daily maximum 8-h average (MDA8) ozone from 2000 to 2019. We applied a deep convolutional neural network and Shapley additive explanations (SHAP) to examine the complex underlying nonlinearity between variations of surface ozone and meteorological factors. Results of the models showed that between 2000 and 2019, specific humidity (38% and 27%) and temperature (28% and 37%) contributed the most to ozone formation over the Houston and Dallas metropolitan areas, respectively. On the other hand, the results show that solar radiation (50%) strongly impacted ozone variation over West Texas during this time. Using a combination of the Kolmogorov-Zurbenko (KZ) filter and multiple linear regression, we also evaluated the influence of meteorology on ozone and quantified the contributions of meteorological parameters to trends in surface ozone formation. Our findings showed that in Houston and Dallas, meteorology influenced ozone variations to a large extent. In West Texas, however, meteorological factors had less influence on ozone variability from 2000 to 2019. This study showed that SHAP analysis and the KZ approach can be used to investigate the contributions of meteorological factors to ozone concentrations and to help policymakers enact effective ozone mitigation policies.
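A Kolmogorov-Zurbenko filter KZ(m, k) is k successive passes of a centered moving average with window length m. The sketch below implements that definition in plain Python. The edge handling (windows shrink near the ends of the series) is an assumption made for illustration, and the function names `moving_average` and `kz_filter` are hypothetical; the abstract does not give the window lengths or iteration counts the authors used.

```python
def moving_average(series, m):
    """Centered moving average with odd window m; windows shrink at the edges."""
    half = m // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        window = series[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def kz_filter(series, m, k):
    """Apply a length-m moving average k times (the KZ(m, k) low-pass filter)."""
    if m < 1 or m % 2 == 0:
        raise ValueError("window length m must be a positive odd integer")
    if k < 1:
        raise ValueError("number of iterations k must be at least 1")
    result = list(series)
    for _ in range(k):
        result = moving_average(result, m)
    return result
```

In trend studies of this kind, the filtered series is typically treated as a slowly varying component and the remainder (original minus filtered) as shorter-term variation; regressing those components on meteorological variables is one way to separate weather-driven changes from other influences.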
Collapse
Affiliation(s)
- Bavand Sadeghi
- Department of Earth and Atmospheric Science, University of Houston, Texas, USA
| | - Masoud Ghahremanloo
- Department of Earth and Atmospheric Science, University of Houston, Texas, USA
| | | | - Yannic Lops
- Department of Earth and Atmospheric Science, University of Houston, Texas, USA
| | - Arman Pouyaei
- Department of Earth and Atmospheric Science, University of Houston, Texas, USA
| | - Yunsoo Choi
- Department of Earth and Atmospheric Science, University of Houston, Texas, USA.
| |
|
10
|
Lyu Y, Ju Q, Lv F, Feng J, Pang X, Li X. Spatiotemporal variations of air pollutants and ozone prediction using machine learning algorithms in the Beijing-Tianjin-Hebei region from 2014 to 2021. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2022; 306:119420. [PMID: 35526642 DOI: 10.1016/j.envpol.2022.119420] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 01/01/2022] [Revised: 04/16/2022] [Accepted: 05/02/2022] [Indexed: 05/16/2023]
Abstract
China was seriously affected by air pollution in the past decade, especially by particulate matter (PM) and, more recently, by emerging ozone pollution. In this study, we systematically examined the spatiotemporal variations of six air pollutants and conducted ozone prediction using machine learning (ML) algorithms in the Beijing-Tianjin-Hebei (BTH) region. The annual-average concentrations of CO, PM10, PM2.5 and SO2 decreased at rates of 141, 11.0, 6.6 and 5.6 μg/m3/year, respectively, while a pattern of initial increase and later decrease was observed for NO2 and O3_8h. The concentrations of SO2, CO and NO2 were higher in Tangshan and Xingtai, while the northern BTH region had lower levels of CO, NO2 and PM. Spatial variations of ozone were relatively small in the BTH region. Monthly variations of PM10 displayed an increase in March, probably due to wind-blown dust from Northwest China. A seasonal and diurnal pattern with summer and afternoon peaks was found for ozone, in contrast with the other pollutants. Further, ML algorithms such as the Random Forest (RF) model and Decision Tree (DT) regression showed good ozone prediction performance (daily: R2 = 0.83 and 0.73, RMSE = 30.0 and 37.3 μg/m3, respectively; monthly: R2 = 0.93 and 0.88, RMSE = 12.1 and 15.8 μg/m3, respectively) based on 10-fold cross-validation. Both the RF model and DT regression relied more on the spatial trend, as higher temporal prediction performance was achieved. Solar radiation- and temperature-related variables presented high importance at the daily level, whereas sea level pressure dominated at the monthly level. The spatiotemporal heterogeneity in variable importance was further confirmed using case studies based on the RF model. In addition, variable importance was possibly influenced by the emission reductions due to the COVID-19 pandemic. Despite its possible weakness in capturing ozone extremes, the RF model was beneficial and is suggested for predicting spatiotemporal variations of ozone in future studies.
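The R2 and RMSE values reported under 10-fold cross-validation can be made concrete with a small sketch. It assumes R2 means the coefficient of determination (1 minus residual over total sum of squares); the abstract does not say which definition was used, and some studies report the squared Pearson correlation instead. The helper names and the interleaved fold assignment are hypothetical choices for illustration, not the authors' procedure.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists; sample i is held out in fold i % k."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n_samples) if i not in held]
        yield train, held_out
```

In a full workflow, a model would be fitted on each `train` list, predictions collected for each `test` list, and `rmse` and `r_squared` computed on the pooled held-out predictions.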
Affiliation(s)
- Yan Lyu
- College of Environment, Zhejiang University of Technology, Hangzhou, 310032, China
| | - Qinru Ju
- School of Accounting, Southwestern University of Finance and Economics, Chengdu, 611130, China
| | - Fengmao Lv
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, 611756, China
| | - Jialiang Feng
- School of Environmental and Chemical Engineering, Shanghai University, Shanghai, 200444, China
| | - Xiaobing Pang
- College of Environment, Zhejiang University of Technology, Hangzhou, 310032, China.
| | - Xiang Li
- Department of Environmental Science & Engineering, Fudan University, Shanghai, 200438, China
| |
|
11
|
Sannigrahi S, Pilla F, Maiti A, Bar S, Bhatt S, Kaparwan A, Zhang Q, Keesstra S, Cerda A. Examining the status of forest fire emission in 2020 and its connection to COVID-19 incidents in West Coast regions of the United States. ENVIRONMENTAL RESEARCH 2022; 210:112818. [PMID: 35104482 PMCID: PMC8800502 DOI: 10.1016/j.envres.2022.112818] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/18/2021] [Revised: 01/10/2022] [Accepted: 01/19/2022] [Indexed: 05/30/2023]
Abstract
Forest fires impact soil, water, and biota resources. The current forest fires in the West Coast of the United States (US) profoundly impacted the atmosphere and air quality across the ecosystems and have caused severe environmental and public health burdens. Forest-fire-led emissions could significantly exacerbate the air pollution level and, therefore, would play a critical role if they occur together with an epidemic or pandemic health crisis. Limited research has been done so far to examine their impact in connection to the current pandemic. As of October 21, nearly 8.2 million acres of forest area were burned, with more than 25 casualties reported so far. In-situ air pollution data were utilized to examine the effects of the 2020 forest fire on the atmosphere and coronavirus (COVID-19) casualties. The spatial-temporal concentrations of particulate matter (PM2.5 and PM10) and Nitrogen Dioxide (NO2) were collected from August 1 to October 30 for 2020 (the fire year) and 2019 (the reference year). Both spatial (Multiscale Geographically Weighted Regression) and non-spatial (Negative Binomial Regression) analyses were performed to assess the adverse effects of fire emission on human health. The in-situ data-led measurements showed that the maximum increases in PM2.5, PM10, and NO2 concentrations (μg/m3) were clustered in the West Coastal fire-prone states during August 1 - October 30, 2020. The average concentration (μg/m3) of particulate matter (PM2.5 and PM10) and NO2 increased in all the fire states severely affected by forest fires. The average PM2.5 concentrations (μg/m3) over the period were recorded as 7.9, 6.3, 5.5, and 5.2 for California, Colorado, Oregon, and Washington in 2019, increasing up to 24.9, 13.4, 25.0, and 17.0 in 2020. Both spatial and non-spatial regression models exhibited a statistically significant association between fire emission and COVID-19 incidents. This association was shown to be robust and stable across a total of 30 models developed for analyzing the spatial non-stationary and local association. More in-depth research is needed to better understand the complex relationship between forest fire emission and human health.
Affiliation(s)
- Srikanta Sannigrahi
- School of Architecture, Planning and Environmental Policy, University College Dublin Richview, Clonskeagh, Dublin, D14 E099, Ireland.
| | - Francesco Pilla
- School of Architecture, Planning and Environmental Policy, University College Dublin Richview, Clonskeagh, Dublin, D14 E099, Ireland
| | - Arabinda Maiti
- Department of Geography, Vidyasagar University, Midnapore, West Bengal, India
| | - Somnath Bar
- Department of Geoinformatics, Central University of Jharkhand, Ranchi, India
| | - Sandeep Bhatt
- Department of Earth Sciences, Indian Institute of Technology Roorkee, India
| | - Ankit Kaparwan
- Department of Statistics, Hemvati Nandan Bahuguna Garhwal University, Srinagar, India
| | - Qi Zhang
- Department of Geography, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Saskia Keesstra
- Team Soil, Water and Land Use, Wageningen Environmental Research, Wageningen University & Research, Wageningen, Netherlands; Civil, Surveying and Environmental Engineering and Centre for Water Security and Environmental Sustainability, The University of Newcastle, Callaghan, 2308, Australia
| | - Artemi Cerda
- Soil Erosion and Degradation Research Group, Department of Geography, Valencia University, Blasco Ibàñez, 28, 46010, Valencia, Spain
| |
|
12
|
Balogun AL, Tella A. Modelling and investigating the impacts of climatic variables on ozone concentration in Malaysia using correlation analysis with random forest, decision tree regression, linear regression, and support vector regression. CHEMOSPHERE 2022; 299:134250. [PMID: 35318016 DOI: 10.1016/j.chemosphere.2022.134250] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/21/2021] [Revised: 12/01/2021] [Accepted: 03/05/2022] [Indexed: 06/14/2023]
Abstract
Climate change is generally known to impact ozone concentration globally. However, the intensity varies across regions and countries. Therefore, local studies are essential to accurately assess the correlation of climate change and ozone concentration in different countries. This study investigates the effects of climatic variables on ozone concentration in Malaysia in order to understand the nexus between climate change and ozone concentration. The selected data were obtained from ten (10) air monitoring stations strategically mounted in urban-industrial and residential areas with significant emissions of pollutants. Correlation analysis and four machine learning algorithms (random forest, decision tree regression, linear regression, and support vector regression) were used to analyze ozone and meteorological datasets in the study area. The analysis was carried out during the southwest monsoon due to the rise of ozone in the dry season. The results show a very strong correlation between temperature and ozone. Wind speed also exhibits a moderate to strong correlation with ozone, while relative humidity is negatively correlated. The highest correlation values were obtained at Bukit Rambai, Nilai, Jaya II Perai, Ipoh, Klang and Petaling Jaya. These locations are highly industrialized and well urbanized. The four machine learning algorithms exhibit high predictive performances, generally ascertaining the predictive accuracy of the climatic variables. The random forest outperformed other algorithms with a very high R2 of 0.970, low RMSE of 2.737 and MAE of 1.824, followed by linear regression, support vector regression and decision tree regression, respectively. This study's outcome indicates a linkage between temperature and wind speed with ozone concentration in the study area. An increase in these variables will likely increase the ozone concentration, posing threats to lives and the environment. Therefore, this study provides data-driven insights for decision-makers and other stakeholders in ensuring good air quality for sustainable cities and communities. It also serves as a guide for the government on the climate actions necessary to reduce the effect of climate change on air pollution and to enable sustainable cities in accordance with the UN's SDGs 13 and 11, respectively.
Affiliation(s)
- Abdul-Lateef Balogun
- Professional Services Department (Resources), Esri Australia, 613 King Street, West Melbourne, VIC, 3003, Australia; Geospatial Analysis and Modelling (GAM) Research Laboratory, Department of Civil and Environmental Engineering, Universiti Teknologi PETRONAS (UTP), 32610, Seri Iskandar, Perak, Malaysia
| | - Abdulwaheed Tella
- Earth, Environment and Space Division, Foresight Institute of Research and Translation, Ibadan, Nigeria; Geospatial Analysis and Modelling (GAM) Research Laboratory, Department of Civil and Environmental Engineering, Universiti Teknologi PETRONAS (UTP), 32610, Seri Iskandar, Perak, Malaysia.
| |
|
13
|
Ren X, Mi Z, Cai T, Nolte CG, Georgopoulos PG. Flexible Bayesian Ensemble Machine Learning Framework for Predicting Local Ozone Concentrations. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2022; 56:3871-3883. [PMID: 35312316 PMCID: PMC9133919 DOI: 10.1021/acs.est.1c04076] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 05/25/2023]
Abstract
3D-grid-based chemical transport models, such as the Community Multiscale Air Quality (CMAQ) modeling system, have been widely used for predicting concentrations of ambient air pollutants. However, typical horizontal resolutions of nationwide CMAQ simulations (12 × 12 km2) cannot capture local-scale gradients for accurately assessing human exposures and environmental justice disparities. In this study, a Bayesian ensemble machine learning (BEML) framework, which integrates 13 learning algorithms, was developed for downscaling CMAQ estimates of ozone daily maximum 8 h averages to the census tract level, across the contiguous US, and was demonstrated for 2011. Three-stage hyperparameter tuning and targeted validations were designed to ensure the ensemble model's ability to interpolate, extrapolate, and capture concentration peaks. The Shapley value metric from coalitional game theory was applied to interpret the drivers of subgrid gradients. The flexibility (transferability) of the 2011-trained BEML model was further tested by evaluating its ability to estimate fine-scale concentrations for other years (2012-2017) without retraining. To demonstrate the feasibility of using the BEML approach to strictly "data-limited" situations, the model was applied to downscale CMAQ outputs for a future-year scenario-based simulation that considers effects of variations in meteorology associated with climate change.
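The Shapley value from coalitional game theory, mentioned in this abstract, averages each player's marginal contribution over every order in which players could join a coalition. The sketch below computes it exactly for a toy two-predictor game. The predictor names and the payoff function are hypothetical and chosen only for illustration; practical SHAP tooling approximates this computation for real models rather than enumerating all orderings.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: mean marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            with_player = coalition | {player}
            totals[player] += value(with_player) - value(coalition)
            coalition = with_player
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical payoff: temperature alone explains 10 units;
    humidity adds 4 more, but only when temperature is present."""
    payoff = 0.0
    if "temperature" in coalition:
        payoff += 10.0
        if "humidity" in coalition:
            payoff += 4.0
    return payoff


contributions = shapley_values(["temperature", "humidity"], toy_value)
# {'temperature': 12.0, 'humidity': 2.0}; the values sum to the full payoff of 14.
```

The interaction term is split evenly between the two predictors, which is the property that makes Shapley values attractive for attributing a prediction to correlated inputs.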
Affiliation(s)
- Xiang Ren
- Environmental and Occupational Health Sciences Institute (EOHSI), Rutgers University, Piscataway, NJ 08854, USA
- Department of Chemical and Biochemical Engineering, Rutgers University, Piscataway, NJ 08854, USA
| | - Zhongyuan Mi
- Environmental and Occupational Health Sciences Institute (EOHSI), Rutgers University, Piscataway, NJ 08854, USA
- Department of Environmental Sciences, Rutgers University, New Brunswick, NJ 08901, USA
| | - Ting Cai
- Environmental and Occupational Health Sciences Institute (EOHSI), Rutgers University, Piscataway, NJ 08854, USA
| | - Christopher G. Nolte
- Center for Environmental Measurement and Modeling, U.S. Environmental Protection Agency, Research Triangle Park, NC 27711, USA
| | - Panos G. Georgopoulos
- Environmental and Occupational Health Sciences Institute (EOHSI), Rutgers University, Piscataway, NJ 08854, USA
- Department of Chemical and Biochemical Engineering, Rutgers University, Piscataway, NJ 08854, USA
- Department of Environmental Sciences, Rutgers University, New Brunswick, NJ 08901, USA
- Department of Environmental and Occupational Health and Justice, Rutgers School of Public Health, Piscataway, NJ 08854, USA
| |
|
14
|
A Systematic Review of Applications of Machine Learning Techniques for Wildfire Management Decision Support. INVENTIONS 2022. [DOI: 10.3390/inventions7010015] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 02/05/2023]
Abstract
Wildfires threaten and kill people, destroy urban and rural property, degrade air quality, ravage forest ecosystems, and contribute to global warming. Wildfire management decision support models are thus important for avoiding or mitigating the effects of these events. In this context, this paper aims at providing a review of recent applications of machine learning methods for wildfire management decision support. The emphasis is on providing a summary of these applications with a classification according to the case study type, machine learning method, case study location, and performance metrics. The review considers documents published in the last four years, using a sample of 135 documents (review articles and research articles). It is concluded that the adoption of machine learning methods may contribute to enhancing support in different fire management phases.
|
15
|
Wang W, Liu X, Bi J, Liu Y. A machine learning model to estimate ground-level ozone concentrations in California using TROPOMI data and high-resolution meteorology. ENVIRONMENT INTERNATIONAL 2022; 158:106917. [PMID: 34624589 DOI: 10.1016/j.envint.2021.106917] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 06/29/2021] [Revised: 09/30/2021] [Accepted: 10/01/2021] [Indexed: 05/25/2023]
Abstract
Estimating ground-level ozone concentrations is crucial to study the adverse health effects of ozone exposure and better understand the impacts of ground-level ozone on biodiversity and vegetation. However, few studies have attempted to use satellite-retrieved ozone as an indicator given its low sensitivity in the boundary layer. Using the Troposphere Monitoring Instrument (TROPOMI)'s total ozone column together with the ozone profile information retrieved by the Ozone Monitoring Instrument (OMI), as the TROPOMI ozone profile product has not been released, we developed a machine learning model to estimate daily maximum 8-hour average ground-level ozone concentration at 10 km spatial resolution in California. In addition to satellite parameters, we included meteorological fields from the High-Resolution Rapid Refresh (HRRR) system at 3 km resolution and land-use information as predictors. Our model achieved an overall 10-fold cross-validation (CV) R2 of 0.84 with root mean square error (RMSE) of 0.0059 ppm, indicating a good agreement between model predictions and observations. Model predictions showed that the suburbs of the Los Angeles metropolitan area had the highest ozone levels, while the Bay Area and the Pacific coast had the lowest. High ozone levels are also seen in Southern California and along the east side of the Central Valley. TROPOMI data improved the estimate of extreme values when compared to a similar model without it. Our study demonstrates the feasibility and value of using TROPOMI data in the spatiotemporal characterization of ground-level ozone concentration.
Affiliation(s)
- Wenhao Wang
- Gangarosa Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, GA, USA
| | - Xiong Liu
- Harvard-Smithsonian Center for Astrophysics, Cambridge, MA, USA
| | - Jianzhao Bi
- Department of Environmental & Occupational Health Sciences, School of Public Health, University of Washington, Seattle, WA, USA
| | - Yang Liu
- Gangarosa Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, GA, USA.
| |
|
16
|
Jain S, Presto AA, Zimmerman N. Spatial Modeling of Daily PM2.5, NO2, and CO Concentrations Measured by a Low-Cost Sensor Network: Comparison of Linear, Machine Learning, and Hybrid Land Use Models. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2021; 55:8631-8641. [PMID: 34133134 DOI: 10.1021/acs.est.1c02653] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 06/12/2023]
Abstract
Previous studies have characterized spatial patterns of pollution with land use regression (LUR) models from distributed passive or filter samplers at low temporal resolution. Large-scale deployment of low-cost sensors (LCS), which typically sample in real time, may enable time-resolved or real-time modeling of concentration surfaces. The aim of this study was to develop spatiotemporal models of PM2.5, NO2, and CO using an LCS network in Pittsburgh, Pennsylvania. We modeled daily average concentrations in August 2016-December 2017 across 50 sites. Land use variables included 13 time-independent (e.g., elevation) and time-dependent (e.g., temperature) predictors. We examined two models: LUR and a machine-learning-enabled land use model (land use random forest, LURF). The LURF models outperformed LUR models, with an increase in the average externally cross-validated R2 of 0.10-0.19. Using wavelet decomposition to separate short-lived events from the regional background, we also created time-decomposed LUR and LURF models. Compared to the standard model, this resulted in an improvement in R2 of up to 0.14. The time-decomposed models were more influenced by spatial parameters. Mapping our models across Allegheny County, we observed that time-decomposed LURF models created robust PM2.5 predictions, suggesting that this approach may improve our ability to map air pollutants at high spatiotemporal resolution.
Affiliation(s)
- Sakshi Jain
- Department of Mechanical Engineering, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
| | - Albert A Presto
- Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Naomi Zimmerman
- Department of Mechanical Engineering, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
| |
|
17
|
Enebish T, Chau K, Jadamba B, Franklin M. Predicting ambient PM2.5 concentrations in Ulaanbaatar, Mongolia with machine learning approaches. JOURNAL OF EXPOSURE SCIENCE & ENVIRONMENTAL EPIDEMIOLOGY 2021; 31:699-708. [PMID: 32747729 PMCID: PMC9871862 DOI: 10.1038/s41370-020-0257-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/23/2020] [Revised: 07/22/2020] [Accepted: 07/23/2020] [Indexed: 05/06/2023]
Abstract
BACKGROUND Accurately assessing individual ambient air pollution exposure is a crucial part of epidemiological studies looking at the adverse health effects of poor air quality. This is particularly challenging in developing countries with high levels of air pollution, mostly due to sparse monitoring networks with a lack of consistent data. METHODS We evaluated the performance of six different machine learning algorithms in predicting fine particulate matter (PM2.5) concentrations in Ulaanbaatar, Mongolia using data between 2010 and 2018. We found that the algorithms produce robust results based on performance metrics. RESULTS Random forest (RF) and gradient boosting models performed the best, with a leave-one-location-out cross-validated R2 of 0.82 when using data from the entire study period. After applying tuned models on the hold-out test set, R2 increased to 0.96 for the RF and 0.90 for the gradient boosting model. We also predicted PM2.5 concentrations for each administrative area (khoroo) of the city using RF, and maps of predictions show spatiotemporal variations that are in line with the location of the high-emission area (ger district), city center, and population density. CONCLUSION Our results provide evidence of the advantage and feasibility of machine learning approaches in predicting ambient PM2.5 levels in a setting with limited resources and extreme air pollution levels.
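Leave-one-location-out cross-validation, as named in this abstract, holds out every record from one monitoring site at a time so that the score reflects prediction at unmonitored locations. The following is a minimal sketch of the splitting step only; the function name and the site labels are hypothetical, and model fitting is omitted.

```python
def leave_one_location_out(locations):
    """Yield (held_out_site, train_indices, test_indices) for each distinct site.

    `locations` lists the site label of every record, in record order.
    """
    for held_out in sorted(set(locations)):
        test = [i for i, loc in enumerate(locations) if loc == held_out]
        train = [i for i, loc in enumerate(locations) if loc != held_out]
        yield held_out, train, test


# Hypothetical site labels for five daily records.
record_sites = ["A", "A", "B", "C", "B"]
splits = list(leave_one_location_out(record_sites))
# [('A', [2, 3, 4], [0, 1]), ('B', [0, 1, 3], [2, 4]), ('C', [0, 1, 2, 4], [3])]
```

Because no record from the held-out site appears in the training set, this scheme tends to give lower, more conservative scores than a random split when observations from the same site are strongly correlated.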
Affiliation(s)
- Temuulen Enebish
- Department of Preventive Medicine, University of Southern California, Los Angeles, CA, 90032, United States.
| | - Khang Chau
- Department of Preventive Medicine, University of Southern California, Los Angeles, CA, 90032, United States
| | - Batbayar Jadamba
- Department of Environmental Monitoring, National Agency for Meteorology and Environmental Monitoring, Ulaanbaatar, Mongolia
| | - Meredith Franklin
- Department of Preventive Medicine, University of Southern California, Los Angeles, CA, 90032, United States
| |
|
18
|
Reid CE, Considine EM, Maestas MM, Li G. Daily PM2.5 concentration estimates by county, ZIP code, and census tract in 11 western states 2008-2018. Sci Data 2021; 8:112. [PMID: 33875665 PMCID: PMC8055869 DOI: 10.1038/s41597-021-00891-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/28/2020] [Accepted: 03/04/2021] [Indexed: 11/20/2022] Open
Abstract
We created daily concentration estimates for fine particulate matter (PM2.5) at the centroids of each county, ZIP code, and census tract across the western US from 2008 to 2018. These estimates are predictions from ensemble machine learning models trained on 24-hour PM2.5 measurements from monitoring station data across 11 states in the western US. Predictor variables were derived from satellite, land cover, chemical transport model (just for the 2008-2016 model), and meteorological data. Ten-fold spatial and random CV R2 were 0.66 and 0.73, respectively, for the 2008-2016 model and 0.58 and 0.72, respectively, for the 2008-2018 model. Comparing areal predictions to nearby monitored observations demonstrated overall R2 of 0.70 for the 2008-2016 model and 0.58 for the 2008-2018 model, but we observed higher R2 (>0.80) in many urban areas. These data can be used to understand spatiotemporal patterns of, exposures to, and health impacts of PM2.5 in the western US, where PM2.5 levels have been heavily impacted by wildfire smoke over this time period.
Affiliation(s)
- Colleen E Reid
- Geography Department, Campus Box 260, University of Colorado Boulder, Boulder, CO, 80309, USA.
- Earth Lab, 4001 Discovery Drive Suite S348 - UCB 611, University of Colorado Boulder, Boulder, CO, 80309, USA.
- Institute of Behavioral Sciences, 483 UCB, University of Colorado Boulder, Boulder, CO, 80309, USA.
| | - Ellen M Considine
- Earth Lab, 4001 Discovery Drive Suite S348 - UCB 611, University of Colorado Boulder, Boulder, CO, 80309, USA
- Applied Mathematics Department, Engineering Center, ECOT 225, 526 UCB, University of Colorado Boulder, Boulder, CO, 80309, USA
| | - Melissa M Maestas
- Earth Lab, 4001 Discovery Drive Suite S348 - UCB 611, University of Colorado Boulder, Boulder, CO, 80309, USA
| | - Gina Li
- Geography Department, Campus Box 260, University of Colorado Boulder, Boulder, CO, 80309, USA
- Earth Lab, 4001 Discovery Drive Suite S348 - UCB 611, University of Colorado Boulder, Boulder, CO, 80309, USA
| |
|
19
|
Development of machine learning model for diagnostic disease prediction based on laboratory tests. Sci Rep 2021; 11:7567. [PMID: 33828178 PMCID: PMC8026627 DOI: 10.1038/s41598-021-87171-5] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 08/31/2020] [Accepted: 03/19/2021] [Indexed: 01/16/2023] Open
Abstract
The use of deep learning and machine learning (ML) in medical science is increasing, particularly in the visual, audio, and language data fields. We aimed to build a new optimized ensemble model by blending a DNN (deep neural network) model with two ML models for disease prediction using laboratory test results. 86 attributes (laboratory tests) were selected from datasets based on value counts, clinical importance-related features, and missing values. We collected sample datasets on 5145 cases, including 326,686 laboratory test results. We investigated a total of 39 specific diseases based on the International Classification of Diseases, 10th revision (ICD-10) codes. These datasets were used to construct light gradient boosting machine (LightGBM) and extreme gradient boosting (XGBoost) ML models and a DNN model using TensorFlow. The optimized ensemble model achieved an F1-score of 81% and prediction accuracy of 92% for the five most common diseases. The deep learning and ML models showed differences in predictive power and disease classification patterns. We used a confusion matrix and analyzed feature importance using the SHAP value method. Our new ML model achieved high efficiency of disease prediction through classification of diseases. This study will be useful in the prediction and diagnosis of diseases.
|
20
|
Estimation of Lower-Stratosphere-to-Troposphere Ozone Profile Using Long Short-Term Memory (LSTM). REMOTE SENSING 2021. [DOI: 10.3390/rs13071374] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
Abstract
Climate change and air pollution are emerging topics due to their possible enormous implications for health and social perspectives. In recent years, tropospheric ozone has been recognized as an important greenhouse gas and pollutant that is detrimental to human health, agriculture, and natural ecosystems, and has shown a trend of increasing interest. Machine-learning-based approaches have been widely applied to the estimation of tropospheric ozone concentrations, but few studies have included tropospheric ozone profiles. This study aimed to predict the Northern Hemisphere distribution of Lower-Stratosphere-to-Troposphere (LST) ozone at a pressure of 100 hPa to the near surface by employing a deep learning Long Short-Term Memory (LSTM) model. We referred to a history of all the observed parameters (meteorological data of European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 (ERA5), satellite data, and the ozone profiles of the World Ozone and Ultraviolet Data Center (WOUDC)) between 2014 and 2018 for training the predictive models. Model–measurement comparisons for the monitoring sites of WOUDC for the period 2019–2020 show that the mean correlation coefficients (R2) in the Northern Hemisphere at high latitude (NH), Northern Hemisphere at middle latitude (NM), and Northern Hemisphere at low latitude (NL) are 0.928, 0.885, and 0.590, respectively, indicating reasonable performance for the LSTM forecasting model. To improve the performance of the model, we applied the LSTM migration models to the Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container (CARIBIC) flights in the Northern Hemisphere from 2018 to 2019 and three urban agglomerations (the Sichuan Basin (SCB), North China Plain (NCP), and Yangtze River Delta region (YRD)) between 2018 and 2019. The results show that our models performed well on the CARIBIC data set, with a high R2 equal to 0.754. 
The daily and monthly surface ozone concentrations for 2018–2019 in the three urban agglomerations were estimated from meteorological and ancillary variables. Our results suggest that the LSTM models can accurately estimate the monthly surface ozone concentrations in the three clusters, with relatively high coefficients of 0.815–0.889, root mean square errors (RMSEs) of 7.769–8.729 ppb, and mean absolute errors (MAEs) of 6.111–6.930 ppb. The daily scale performance was not as high as the monthly scale performance, with R2 = 0.636–0.737, RMSE = 14.543–16.916 ppb, and MAE = 11.130–12.687 ppb. In general, the trained module based on LSTM is robust and can capture the variation of the atmospheric ozone distribution. Moreover, it also contributes to our understanding of the mechanism of air pollution, especially increasing our comprehension of pollutant areas.
|
21
|
Aguilera R, Corringham T, Gershunov A, Benmarhnia T. Wildfire smoke impacts respiratory health more than fine particles from other sources: observational evidence from Southern California. Nat Commun 2021; 12:1493. [PMID: 33674571 PMCID: PMC7935892 DOI: 10.1038/s41467-021-21708-0] [Citation(s) in RCA: 140] [Impact Index Per Article: 46.7] [Received: 02/24/2020] [Accepted: 02/03/2021] [Indexed: 01/31/2023] Open
Abstract
Wildfires are becoming more frequent and destructive in a changing climate. Fine particulate matter, PM2.5, in wildfire smoke adversely impacts human health. Recent toxicological studies suggest that wildfire particulate matter may be more toxic than equal doses of ambient PM2.5. Air quality regulations however assume that the toxicity of PM2.5 does not vary across different sources of emission. Assessing whether PM2.5 from wildfires is more or less harmful than PM2.5 from other sources is a pressing public health concern. Here, we isolate the wildfire-specific PM2.5 using a series of statistical approaches and exposure definitions. We found increases in respiratory hospitalizations ranging from 1.3 to up to 10% with a 10 μg m-3 increase in wildfire-specific PM2.5, compared to 0.67 to 1.3% associated with non-wildfire PM2.5. Our conclusions point to the need for air quality policies to consider the variability in PM2.5 impacts on human health according to the sources of emission.
Affiliation(s)
- Rosana Aguilera
  - Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA (GRID grid.266100.3; ISNI 0000 0001 2107 4242)
- Thomas Corringham
  - Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA (GRID grid.266100.3; ISNI 0000 0001 2107 4242)
- Alexander Gershunov
  - Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA (GRID grid.266100.3; ISNI 0000 0001 2107 4242)
- Tarik Benmarhnia
  - Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA (GRID grid.266100.3; ISNI 0000 0001 2107 4242)
  - Herbert Wertheim School of Public Health and Human Longevity Science, University of California San Diego, La Jolla, CA, USA (GRID grid.266100.3; ISNI 0000 0001 2107 4242)
|
22
|
Feng R, Huang CC, Luo K, Zheng HJ. Deciphering wintertime air pollution upon the West Lake of Hangzhou, China. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2021. [DOI: 10.3233/jifs-201964] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
The West Lake of Hangzhou, a world-famous landscape and cultural symbol of China, suffered from severe air quality degradation in January 2015. In this work, Random Forest (RF) and Recurrent Neural Networks (RNN) are used to analyze and predict air pollutants on the central island of the West Lake. We quantitatively demonstrate that PM2.5 and PM10 were chiefly associated with fluctuations in the gaseous air pollutants (SO2, NO2 and CO). Compared with the gaseous air pollutants, meteorological conditions and regional transport played trivial roles in shaping PM. The predominant meteorological factor for SO2, NO2 and surface O3 was dew-point deficit. The proportion of sulfate in PM10 was higher than that in PM2.5. CO was strongly positively linked with PM. We find that machine learning can accurately predict daily average wintertime SO2, NO2, PM2.5 and PM10, casting new light on the forecasting and early warning of future high-pollution episodes.
Affiliation(s)
- Rui Feng
  - State Key Laboratory of Clean Energy Utilization, Zhejiang University, Hangzhou, P. R. China
  - Hangzhou Engineering Consulting Center Co., Ltd, Hangzhou, P. R. China
  - Zhejiang Academy of Ecological and Environmental Sciences, Hangzhou, P. R. China
  - Hangzhou Knowledge Chain Technology Co., Ltd, Hangzhou, P. R. China
- Cheng-Chen Huang
  - Hangzhou Municipal Environmental Monitoring Central Station, Hangzhou, P. R. China
- Kun Luo
  - State Key Laboratory of Clean Energy Utilization, Zhejiang University, Hangzhou, P. R. China
- Hui-Jun Zheng
  - Department of Critical Care Medicine, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, P. R. China
|
23
|
Nabavi SO, Nölscher AC, Samimi C, Thomas C, Haimberger L, Lüers J, Held A. Site-scale modeling of surface ozone in Northern Bavaria using machine learning algorithms, regional dynamic models, and a hybrid model. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2021; 268:115736. [PMID: 33120341 DOI: 10.1016/j.envpol.2020.115736] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/08/2020] [Revised: 09/23/2020] [Accepted: 09/24/2020] [Indexed: 06/11/2023]
Abstract
Ozone (O3) is a harmful pollutant when present in the lowermost layer of the atmosphere. Therefore, the European Commission formulated directives to regulate O3 concentrations in near-surface air. However, almost 50% of the 5068 air quality stations in Europe do not monitor O3 concentrations. This study aims to provide a hybrid modeling system that fills these gaps in the hourly surface O3 observations on a site scale with much higher accuracy than existing O3 models. This hybrid model was developed using estimations from multiple linear regression-based eXtreme Gradient Boosting Machines (MLR-XGBM) and O3 reanalysis from European regional air quality models (CAMS-EU). The binary classification of extremely high O3 events and the 1- and 24-h forecasts of hourly O3 were investigated as secondary aims. In this study thirteen stations in Northern Bavaria, out of which six do not monitor O3, were chosen as test sites. Considering the computational complexity of machine learning algorithms (MLAs), we also applied two recent MLA interpretation methods, namely SHapley Additive exPlanations (SHAP) and Local interpretable model-agnostic explanations (LIME). With SHAP, we showed an increasing effect of temperature on O3 concentrations which intensifies for temperatures exceeding 17 °C. According to LIME, O3 concentration peaks are mainly governed by meteorological factors under dry and warm conditions on a regional scale, whereas local nitrogen oxide concentrations control base O3 concentrations during cold and wet periods. While recently developed MLAs for the spatial estimation of hourly O3 concentrations had a station-based root-mean-square error (RMSE) above 27 μg/m3, our proposed model significantly reduced the estimation errors by about 66% with an RMSE of 9.49 μg/m3. 
We also found that logistic regression (LR) and MLR-XGBM performed best in the site-scale classification and 24-h forecast of O3 concentrations (with a station-averaged accuracy and RMSE of 0.95 and 19.34 μg/m3, respectively).
Affiliation(s)
- Seyed Omid Nabavi
  - Climatology Group, University of Bayreuth, Bayreuth, Germany; BayCEER, University of Bayreuth, Bayreuth, Germany
- Anke C Nölscher
  - BayCEER, University of Bayreuth, Bayreuth, Germany; Atmospheric Chemistry Group, University of Bayreuth, Bayreuth, Germany
- Cyrus Samimi
  - Climatology Group, University of Bayreuth, Bayreuth, Germany; BayCEER, University of Bayreuth, Bayreuth, Germany
- Christoph Thomas
  - BayCEER, University of Bayreuth, Bayreuth, Germany; Micrometeorology Group, University of Bayreuth, Bayreuth, Germany
- Leopold Haimberger
  - Department of Meteorology and Geophysics, University of Vienna, Vienna, Austria
- Johannes Lüers
  - BayCEER, University of Bayreuth, Bayreuth, Germany; Micrometeorology Group, University of Bayreuth, Bayreuth, Germany
- Andreas Held
  - BayCEER, University of Bayreuth, Bayreuth, Germany; Chair of Environmental Chemistry and Air Quality, Department of Environmental Science and Technology, TU Berlin, Germany
|
24
|
Estimation of PM2.5 Concentrations in New York State: Understanding the Influence of Vertical Mixing on Surface PM2.5 Using Machine Learning. ATMOSPHERE 2020. [DOI: 10.3390/atmos11121303] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/16/2022]
Abstract
In New York State (NYS), episodic high fine particulate matter (PM2.5) concentrations associated with aerosols originating from the Midwest, Mid-Atlantic, and Pacific Northwest states have been reported. In this study, machine learning techniques, including multiple linear regression (MLR) and artificial neural network (ANN), were used to estimate surface PM2.5 mass concentrations at air quality monitoring sites in NYS during the summers of 2016–2019. Various predictors were considered, including meteorological, aerosol, and geographic predictors. Vertical predictors, designed as indicators of vertical mixing and aloft aerosols, were also applied. Overall, the ANN models performed better than the MLR models, and the application of vertical predictors generally improved the accuracy of PM2.5 estimation by the ANN models. The leave-one-out cross-validation results showed significant cross-site variations and revealed different predictor-PM2.5 correlations at sites with different PM2.5 characteristics. In addition, a joint analysis of regression coefficients from the MLR model and variable importance from the ANN model provided insights into the contributions of selected predictors to PM2.5 concentrations. The improvements in model performance due to aloft aerosols were relatively minor, probably because of the limited number of aloft-aerosol cases in current datasets.
|
25
|
Abstract
The concentration of surface ozone (O3) strongly depends on environmental and meteorological variables through a series of complex and non-linear functions. This study aims to explore the performance of an advanced machine learning (ML) method, the boosted regression trees (BRT) technique, in exploring the relationships between surface O3 and its driving factors, and in predicting O3 concentrations. To this end, a BRT model was trained on hourly data of air pollutants and meteorological parameters, acquired over the 2016–2018 period in a rural area affected by an anthropogenic source of air pollutants. The ability of the BRT model to rank, visualize, and predict the relationship between ground-level O3 concentrations and their driving factors was analyzed and illustrated. A comparison with a multiple linear regression (MLR) model was performed based on several statistical indicators. The results indicated that the BRT model was able to account for 81% of changes in O3 concentrations; it slightly outperformed the MLR model in terms of prediction accuracy and allowed a better identification of the main factors influencing O3 variability on a local scale. This knowledge is expected to be useful in defining effective measures to prevent and/or mitigate the health damages associated with O3 exposure.
|
26
|
Band SS, Janizadeh S, Chandra Pal S, Saha A, Chakrabortty R, Shokri M, Mosavi A. Novel Ensemble Approach of Deep Learning Neural Network (DLNN) Model and Particle Swarm Optimization (PSO) Algorithm for Prediction of Gully Erosion Susceptibility. SENSORS (BASEL, SWITZERLAND) 2020; 20:E5609. [PMID: 33008132 PMCID: PMC7582716 DOI: 10.3390/s20195609] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Received: 08/11/2020] [Revised: 09/22/2020] [Accepted: 09/24/2020] [Indexed: 11/16/2022]
Abstract
This study aims to evaluate a new approach in modeling gully erosion susceptibility (GES) based on a deep learning neural network (DLNN) model and an ensemble particle swarm optimization (PSO) algorithm with DLNN (PSO-DLNN), comparing these approaches with common artificial neural network (ANN) and support vector machine (SVM) models in Shirahan watershed, Iran. For this purpose, 13 independent variables affecting GES in the study area, namely, altitude, slope, aspect, plan curvature, profile curvature, drainage density, distance from a river, land use, soil, lithology, rainfall, stream power index (SPI), and topographic wetness index (TWI), were prepared. A total of 132 gully erosion locations were identified during field visits. To implement the proposed model, the dataset was divided into the two categories of training (70%) and testing (30%). The results indicate that the area under the curve (AUC) value from receiver operating characteristic (ROC) considering the testing datasets of PSO-DLNN is 0.89, which indicates superb accuracy. The rest of the models are associated with optimal accuracy and have similar results to the PSO-DLNN model; the AUC values from ROC of DLNN, SVM, and ANN for the testing datasets are 0.87, 0.85, and 0.84, respectively. The efficiency of the proposed model in terms of prediction of GES was increased. Therefore, it can be concluded that the DLNN model and its ensemble with the PSO algorithm can be used as a novel and practical method to predict gully erosion susceptibility, which can help planners and managers to manage and reduce the risk of this phenomenon.
Affiliation(s)
- Shahab S. Band
  - Institute of Research and Development, Duy Tan University, Da Nang 550000, Vietnam
  - Future Technology Research Center, National Yunlin University of Science and Technology, 123 University Road, Section 3, Douliou, Yunlin 64002, Taiwan
- Saeid Janizadeh
  - Department of Watershed Management Engineering and Sciences, Faculty in Natural Resources and Marine Science, Tarbiat Modares University, 14115-111 Tehran, Iran
- Subodh Chandra Pal
  - Department of Geography, The University of Burdwan, West Bengal, Burdwan 713104, India
- Asish Saha
  - Department of Geography, The University of Burdwan, West Bengal, Burdwan 713104, India
- Rabin Chakrabortty
  - Department of Geography, The University of Burdwan, West Bengal, Burdwan 713104, India
- Manouchehr Shokri
  - Institute of Structural Mechanics, Bauhaus Universität Weimar, 99423 Weimar, Germany
- Amirhosein Mosavi
  - Environmental Quality, Atmospheric Science and Climate Change Research Group, Ton Duc Thang University, Ho Chi Minh City 700000, Vietnam
  - Faculty of Environment and Labour Safety, Ton Duc Thang University, Ho Chi Minh City 700000, Vietnam
|
27
|
Requia WJ, Di Q, Silvern R, Kelly JT, Koutrakis P, Mickley LJ, Sulprizio MP, Amini H, Shi L, Schwartz J. An Ensemble Learning Approach for Estimating High Spatiotemporal Resolution of Ground-Level Ozone in the Contiguous United States. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2020; 54:11037-11047. [PMID: 32808786 PMCID: PMC7498146 DOI: 10.1021/acs.est.0c01791] [Citation(s) in RCA: 86] [Impact Index Per Article: 21.5] [Indexed: 05/17/2023]
Abstract
In this paper, we integrated multiple types of predictor variables and three types of machine learners (neural network, random forest, and gradient boosting) into a geographically weighted ensemble model to estimate the daily maximum 8 h O3 with high resolution over both space (at 1 km × 1 km grid cells covering the contiguous United States) and time (daily estimates between 2000 and 2016). We further quantify monthly model uncertainty for our 1 km × 1 km gridded domain. The results demonstrate high overall model performance with an average cross-validated R2 (coefficient of determination) against observations of 0.90 and 0.86 for annual averages. Overall, the model performance of the three machine learning algorithms was quite similar. The overall model performance from the ensemble model outperformed those from any single algorithm. The East North Central region of the United States had the highest R2, 0.93, and performance was weakest for the western mountainous regions (R2 of 0.86) and New England (R2 of 0.87). For the cross validation by season, our model had the best performance during summer with an R2 of 0.88. This study can be useful for the environmental health community to more accurately estimate the health impacts of O3 over space and time, especially in health studies at an intra-urban scale.
Affiliation(s)
- Weeberb J. Requia
  - Harvard University, Department of Environmental Health, TH Chan School of Public Health, Boston, Massachusetts, United States
  - School of Public Policy and Government, Fundação Getúlio Vargas, Brasília, Distrito Federal, Brazil
  - Corresponding Author: SGAN 602, Asa Norte, Brasília, DF, 70830-051, Brazil
- Qian Di
  - Harvard University, Department of Environmental Health, TH Chan School of Public Health, Boston, Massachusetts, United States
  - Research Center for Public Health, Tsinghua University, Beijing, China
- Rachel Silvern
  - Harvard University, John A. Paulson School of Engineering and Applied Sciences, Boston, Massachusetts, United States
- James T. Kelly
  - U.S. Environmental Protection Agency, Office of Air Quality Planning & Standards, Research Triangle Park, NC, United States
- Petros Koutrakis
  - Harvard University, Department of Environmental Health, TH Chan School of Public Health, Boston, Massachusetts, United States
- Loretta J. Mickley
  - Harvard University, John A. Paulson School of Engineering and Applied Sciences, Boston, Massachusetts, United States
- Melissa P. Sulprizio
  - Harvard University, John A. Paulson School of Engineering and Applied Sciences, Boston, Massachusetts, United States
- Heresh Amini
  - Harvard University, Department of Environmental Health, TH Chan School of Public Health, Boston, Massachusetts, United States
  - Department of Public Health, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Liuhua Shi
  - Harvard University, Department of Environmental Health, TH Chan School of Public Health, Boston, Massachusetts, United States
  - Emory University, Gangarosa Department of Environmental Health, Rollins School of Public Health, Atlanta, Georgia, United States
- Joel Schwartz
  - Harvard University, Department of Environmental Health, TH Chan School of Public Health, Boston, Massachusetts, United States
|
29
|
Xi Y, Kshirsagar AV, Wade TJ, Richardson DB, Brookhart MA, Wyatt L, Rappold AG. Mortality in US Hemodialysis Patients Following Exposure to Wildfire Smoke. J Am Soc Nephrol 2020; 31:1824-1835. [PMID: 32675302 DOI: 10.1681/asn.2019101066] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 10/16/2019] [Accepted: 04/09/2020] [Indexed: 01/28/2023]
Abstract
BACKGROUND Wildfires are increasingly a significant source of fine particulate matter (PM2.5), which has been linked to adverse health effects and increased mortality. ESKD patients are potentially susceptible to this environmental stressor. METHODS We conducted a retrospective time-series analysis of the association between daily exposure to wildfire PM2.5 and mortality in 253 counties near a major wildfire between 2008 and 2012. Using quasi-Poisson regression models, we estimated rate ratios (RRs) for all-cause mortality on the day of exposure and up to 30 days following exposure, adjusted for background PM2.5, day of week, seasonality, and heat. We stratified the analysis by causes of death (cardiac, vascular, infectious, or other) and place of death (clinical or nonclinical setting) for differential PM2.5 exposure and outcome classification. RESULTS We found 48,454 deaths matched to the 253 counties. A 10-μg/m3 increase in wildfire PM2.5 associated with a 4% increase in all-cause mortality on the same day (RR, 1.04; 95% confidence interval [95% CI], 1.01 to 1.07) and 7% increase cumulatively over 30 days following exposure (RR, 1.07; 95% CI, 1.01 to 1.12). Risk was elevated following exposure for deaths occurring in nonclinical settings (RR, 1.07; 95% CI, 1.02 to 1.12), suggesting modification of exposure by place of death. "Other" deaths (those not attributed to cardiac, vascular, or infectious causes) accounted for the largest portion of deaths and had a strong same-day effect (RR, 1.08; 95% CI, 1.03 to 1.12) and cumulative effect over the 30-day period. On days with a wildfire PM2.5 contribution >10 μg/m3, exposure accounted for 8.4% of mortality. CONCLUSIONS Wildfire smoke exposure was positively associated with all-cause mortality among patients receiving in-center hemodialysis.
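The percentage increases quoted in this abstract follow directly from the rate ratios. As a brief worked sketch, assuming the usual log-linear form of a quasi-Poisson regression (the symbols β for the wildfire PM2.5 coefficient and Δ for the exposure increment are introduced here for illustration and are not taken from the paper):

```latex
\[
\mathrm{RR}(\Delta) = e^{\beta \Delta},
\qquad
\text{percent change in mortality rate} = 100\,(\mathrm{RR} - 1)
\]
\[
\Delta = 10~\mu\mathrm{g/m^3}:\quad
\mathrm{RR} = 1.04 \;\Rightarrow\; 4\% \text{ (same day)},
\qquad
\mathrm{RR} = 1.07 \;\Rightarrow\; 7\% \text{ (cumulative over 30 days)}
\]
```

The same conversion applies to the confidence limits, so an interval of 1.01 to 1.07 corresponds to an increase of between 1% and 7%.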
Affiliation(s)
- Yuzhi Xi
  - Oak Ridge Institute for Science and Education at the United States Environmental Protection Agency, National Health and Environmental Effects Research Laboratory, Environmental Public Health Division, Research Triangle Park, North Carolina
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Abhijit V Kshirsagar
  - University of North Carolina Kidney Center and Division of Nephrology and Hypertension, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Timothy J Wade
  - United States Environmental Protection Agency, Center for Public Health and Environmental Assessment, Research Triangle Park, North Carolina
- David B Richardson
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- M Alan Brookhart
  - Department of Population Health Sciences, Duke University, Durham, North Carolina
- Lauren Wyatt
  - Oak Ridge Institute for Science and Education at the United States Environmental Protection Agency, National Health and Environmental Effects Research Laboratory, Environmental Public Health Division, Research Triangle Park, North Carolina
- Ana G Rappold
  - United States Environmental Protection Agency, Center for Public Health and Environmental Assessment, Research Triangle Park, North Carolina
|
30
|
Comparisons of Diverse Machine Learning Approaches for Wildfire Susceptibility Mapping. Symmetry (Basel) 2020. [DOI: 10.3390/sym12040604] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Indexed: 01/19/2023]
Abstract
Climate change has increased the probability of the occurrence of catastrophes like wildfires, floods, and storms across the globe in recent years. Weather conditions continue to grow more extreme, and wildfires are occurring more frequently and spreading with greater intensity. Wildfires ravage forest areas, as recently seen in the Amazon, the United States, and more recently in Australia. The availability of remotely sensed data has vastly improved, and enables us to precisely locate wildfires for monitoring purposes. Wildfire inventory data were created by integrating the polygons collected through field surveys using global positioning systems (GPS) and the data collected from the Moderate Resolution Imaging Spectroradiometer (MODIS) thermal anomalies product between 2012 and 2017 for the study area. The inventory data, along with sixteen conditioning factors selected for the study area, were used to appraise the potential of various machine learning (ML) methods for wildfire susceptibility mapping in Amol County. The ML methods chosen for this study are artificial neural network (ANN), dmine regression (DR), DM neural, least angle regression (LARS), multi-layer perceptron (MLP), random forest (RF), radial basis function (RBF), self-organizing maps (SOM), support vector machine (SVM), and decision tree (DT), along with the statistical approach of logistic regression (LR), which is very apt for wildfire susceptibility studies. The wildfire inventory data were split into three folds, with 66% used for training the models and 33% used for accuracy assessment within three-fold cross-validation (CV). Receiver operating characteristic (ROC) analysis was used to assess the accuracy of the ML approaches. RF had the highest accuracy of 88%, followed by SVM with an accuracy of almost 79%, and LR had the lowest accuracy of 65%. This shows that RF is better suited for wildfire susceptibility assessments in our case study area.
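The three-fold scheme mentioned in this abstract, in which roughly two thirds of the inventory train the model and the remaining third is held out in each round, can be sketched as below. This is a minimal illustration only: the function name `three_fold_splits`, the random shuffling, and the `seed` parameter are assumptions made for the example, and the paper does not describe how its folds were actually drawn.

```python
import random


def three_fold_splits(n_samples, seed=0):
    """Yield (train_indices, test_indices) for three-fold cross-validation.

    Hypothetical helper for illustration; each round holds out one fold
    (about 33% of the samples) and trains on the other two (about 66%).
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Deal the shuffled indices into three folds of near-equal size.
    folds = [indices[i::3] for i in range(3)]

    for k in range(3):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test
```

Every sample is used for accuracy assessment exactly once across the three rounds, so the per-round scores can be averaged into a single estimate.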
|