1. Aman N, Panyametheekul S, Pawarmart I, Xian D, Gao L, Tian L, Manomaiphiboon K, Wang Y. Machine learning-based quantification and separation of emissions and meteorological effects on PM2.5 in Greater Bangkok. Sci Rep 2025; 15:14775. [PMID: 40295616 PMCID: PMC12038008 DOI: 10.1038/s41598-025-99094-6] [Received: 12/20/2024] [Accepted: 04/16/2025] [Indexed: 04/30/2025]
Abstract
This study presents the first application of machine learning (ML)-based meteorological normalization and Shapley additive explanations (SHAP) analysis to quantify, separate, and understand the effect of meteorology on PM2.5 over Greater Bangkok (GBK). Six ML models, namely random forest (RF), adaptive boosting (ADB), gradient boosting (GB), extreme gradient boosting (XGB), light gradient boosting machine (LGBM), and cat boosting (CB), were used with meteorological factors, fire activity, land use, and socio-economic data as predictor variables. LGBM outperformed the other models, achieving ρ = 0.9 (0.95), MBE = 0 (-0.01), MAE = 5.5 (3.3) μg m⁻³, and RMSE = 8.7 (4.9) μg m⁻³ for hourly (daily) PM2.5 prediction. LGBM was used for spatiotemporal PM2.5 estimation, and meteorological normalization was applied to calculate PM2.5_emis (emission-related PM2.5) and PM2.5_met (meteorology-related PM2.5). Diurnal variation reveals higher PM2.5 levels in the morning (08-10 LT) due to increased traffic emissions and thermal inversion, and a decrease in PM2.5 as the day progresses due to decreased emissions and inversion dissipation. Monthly variation suggests higher PM2.5 in winter (December and January) due to emissions and stagnant meteorological conditions. Negative PM2.5_met values during November, March, and April show that meteorology improves air quality, while positive values from December to February indicate that stagnant winter conditions worsen it. During winter, PM2.5_emis and PM2.5 showed an increasing trend over 15.6% and 67.8% of the area, respectively, while the area with decreasing trends fell from 23.2% to 1.9%. In summer, the percentage of area with an increasing trend rose from 18.7% to 34.6%, and the area with decreasing trends fell from 12.6% to 6.5%. PM2.5 increased despite emissions decreasing over a larger area, indicating limited effectiveness of mitigation measures. Winter exhibits greater PM2.5 variability due to episodic increases from changing meteorological conditions. In Bangkok and nearby areas, higher variability is mainly driven by meteorology, with more consistent emissions in Bangkok compared to rural areas affected by agricultural burning. PM2.5 and PM2.5_emis showed stronger persistence in winter than in summer, with weaker effects in Bangkok. Hurst exponent averages were 0.75, 0.76, and 0.72 for PM2.5 and 0.79, 0.8, and 0.73 for PM2.5_emis in the dry, winter, and summer seasons, respectively. SHAP analysis identified relative humidity, planetary boundary layer height, v wind, temperature, u wind, global radiation, and aerosol optical depth as the key variables affecting PM2.5, with mean absolute SHAP values of 5.29, 4.79, 4.29, 3.68, 2.37, 2.22, and 2.03, respectively. Policy recommendations are proposed based on these findings.
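The skill scores quoted in this abstract (ρ, MBE, MAE, RMSE) can be illustrated with a minimal sketch. It assumes that ρ is the Pearson correlation coefficient and that bias is defined as predicted minus observed; neither convention is stated in the abstract. The function name `evaluate` and the sample values are hypothetical and are not data from the study.

```python
import math


def evaluate(obs, pred):
    """Return Pearson correlation, MBE, MAE and RMSE for paired series.

    Assumptions (not confirmed by the abstract): rho is Pearson's r and
    the error is defined as predicted minus observed.
    """
    if len(obs) != len(pred) or len(obs) < 2:
        raise ValueError("obs and pred must have the same length (>= 2)")
    n = len(obs)
    errors = [p - o for o, p in zip(obs, pred)]
    mbe = sum(errors) / n                              # mean bias error
    mae = sum(abs(e) for e in errors) / n              # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)   # root mean square error
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    # Note: a constant series has zero variance and makes rho undefined.
    rho = cov / math.sqrt(var_o * var_p)
    return {"rho": rho, "MBE": mbe, "MAE": mae, "RMSE": rmse}


# Hypothetical PM2.5 values in ug m^-3, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
scores = evaluate(observed, predicted)
print(scores)
```

In this toy case the positive and negative errors cancel, so MBE is zero while MAE and RMSE are not, which is why a near-zero MBE on its own says little about accuracy.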
Affiliation(s)
- Nishit Aman
  - Department of Environmental and Sustainable Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, Thailand
- Sirima Panyametheekul
  - Department of Environmental and Sustainable Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, Thailand
  - Energy Research Institute, Chulalongkorn University, Bangkok, 10330, Thailand
- Ittipol Pawarmart
  - Pollution Control Department, Ministry of Natural Resources and Environment, Bangkok, Thailand
- Di Xian
  - National Satellite Meteorological Center (National Center for Space Weather), China Meteorological Administration, Beijing, China
  - Innovation Center for FengYun Meteorological Satellite (FYSIC), China Meteorological Administration, Beijing, China
  - Key Laboratory of Radiometric Calibration and Validation for Environmental Satellites, China Meteorological Administration, Beijing, China
- Ling Gao
  - National Satellite Meteorological Center (National Center for Space Weather), China Meteorological Administration, Beijing, China
  - Innovation Center for FengYun Meteorological Satellite (FYSIC), China Meteorological Administration, Beijing, China
  - Key Laboratory of Radiometric Calibration and Validation for Environmental Satellites, China Meteorological Administration, Beijing, China
- Lin Tian
  - National Satellite Meteorological Center (National Center for Space Weather), China Meteorological Administration, Beijing, China
  - Innovation Center for FengYun Meteorological Satellite (FYSIC), China Meteorological Administration, Beijing, China
  - Key Laboratory of Radiometric Calibration and Validation for Environmental Satellites, China Meteorological Administration, Beijing, China
- Kasemsan Manomaiphiboon
  - The Joint Graduate School of Energy and Environment, King Mongkut's University of Technology Thonburi, Bangkok, Thailand
  - Center of Excellence on Energy Technology and Environment, Ministry of Higher Education, Science, Research and Innovation, Bangkok, Thailand
- Yangjun Wang
  - School of Environmental and Chemical Engineering, Shanghai University, Shanghai, China

2. Shi H, Yang X, Tang H, Tu Y. Temporally boosting neural network for improving dynamic prediction of PM2.5 concentration with changing and unbalanced distribution. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2025; 383:125371. [PMID: 40267806 DOI: 10.1016/j.jenvman.2025.125371] [Received: 04/12/2024] [Revised: 03/17/2025] [Accepted: 04/12/2025] [Indexed: 04/25/2025]
Abstract
Increasing medical research evidence suggests that even low PM2.5 concentrations may trigger significant health issues. Hence, accurate prediction of PM2.5 is of great importance for protecting public health. However, current data-driven predictive methods exhibit seasonal declines in model performance and difficulties in predicting extremely high values. These issues may stem from neglecting two crucial features of PM2.5 data streams, i.e., concept drift and imbalanced distribution. In this study, we validate this hypothesis by conducting an in-depth analysis of the characteristics of the PM2.5 data stream and of the prediction errors of three mainstream models trained on it, i.e., random forest, convolutional neural network, and transformer. Based on the identified types of concept drift and the patterns of imbalanced distribution, we introduce the Temporally boosting neural network (Temp-boost), a novel ensemble learning method designed to enhance predictive accuracy by integrating static and dynamic models. Static models, which are trained on balanced historical datasets, typically receive infrequent updates. Conversely, dynamic models are trained on newly arrived data and undergo more frequent updates. We evaluated the performance of Temp-boost and the three aforementioned models in predicting gridded PM2.5 concentrations across the North China Plain in 2019. Compared to the three models, Temp-boost shows improved prediction accuracy across seasons, with notable gains at high pollution levels. Specifically, for pollution levels above lightly polluted, Temp-boost reduces the average MAE by 13.22 μg m⁻³ and the average RMSE by 13.32 μg m⁻³, with reductions peaking at 26.45 μg m⁻³ (MAE) and 25.76 μg m⁻³ (RMSE) in more severe cases.
Affiliation(s)
- Haoze Shi
  - State Key Laboratory of Remote Sensing Science, Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, PR China
- Xin Yang
  - State Key Laboratory of Remote Sensing Science, Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, PR China
- Hong Tang
  - State Key Laboratory of Remote Sensing Science, Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, PR China
- Yuhong Tu
  - State Key Laboratory of Remote Sensing Science, Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, PR China

3. Liu Z, Fang Z, Hu Y. A deep learning-based hybrid method for PM2.5 prediction in central and western China. Sci Rep 2025; 15:10080. [PMID: 40128263 PMCID: PMC11933421 DOI: 10.1038/s41598-025-95460-6] [Received: 01/16/2025] [Accepted: 03/21/2025] [Indexed: 03/26/2025]
Abstract
To mitigate the adverse effects of air pollution, accurate PM2.5 prediction is particularly important. Existing approaches struggle to escape the limitations inherent in any single model. This study proposes a hybrid PM2.5 prediction model based on deep learning techniques, which aims to combine the complementary strengths of its component models through model fusion. The model integrates the transformer and LSTM architectures and tunes its parameters with the particle swarm optimization (PSO) algorithm. The proposed model achieves superior performance by utilizing the gating mechanism of the LSTM model, the positional encoding and self-attention mechanism of the Transformer model, and PSO's robust optimization capabilities. Experimental results show that the new model outperforms both the traditional LSTM model and the PSO-LSTM model in the PM2.5 prediction task, with improvements in all evaluation metrics (R², MAE, MBE, RMSE, and MAPE). Furthermore, the model demonstrates stable performance across different cities and various periods. This study offers a robust approach to improving the accuracy and reliability of PM2.5 forecasting.
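The role PSO plays in parameter optimization can be illustrated with a minimal, generic sketch. This is a textbook global-best PSO, not the authors' implementation: the inertia and acceleration coefficients, the two tuned parameters (`learning_rate`, `hidden_units`), the search bounds, and the stand-in objective `toy_validation_loss` are all assumptions made for illustration. In practice the objective would train the network and return a validation error.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic global-best particle swarm optimization (illustrative only)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest_pos = [p[:] for p in pos]               # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest_pos, gbest_val = pbest_pos[g][:], pbest_val[g]   # best position of the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull to personal best, pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Keep the particle inside the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest_val[i], pbest_pos[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val


def toy_validation_loss(params):
    """Hypothetical stand-in for 'train the model, return validation error'."""
    learning_rate, hidden_units = params
    return 1e4 * (learning_rate - 0.01) ** 2 + ((hidden_units - 64.0) / 64.0) ** 2


search_bounds = [(1e-4, 0.1), (8.0, 256.0)]  # assumed ranges, not from the paper
best_params, best_loss = pso_minimize(toy_validation_loss, search_bounds)
print(best_params, best_loss)
```

Because each objective evaluation would correspond to a full training run, the swarm size and iteration count set the tuning cost directly; the defaults here are chosen only to keep the toy example fast.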
Affiliation(s)
- Zuhan Liu
  - School of Information Engineering, Nanchang Institute of Technology, Nanchang, 330099, China
  - Jiangxi Province Key Laboratory of Smart Water Conservancy, Nanchang, 330099, China
- Zihai Fang
  - School of Information Engineering, Nanchang Institute of Technology, Nanchang, 330099, China
- Yuanhao Hu
  - School of Information Engineering, Nanchang Institute of Technology, Nanchang, 330099, China

4. Zhao S, Lin H, Wang H, Liu G, Wang X, Du K, Ren G. Spatiotemporal distribution prediction for PM2.5 based on STXGBoost model and high-density monitoring sensors in Zhengzhou High Tech Zone, China. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2025; 373:123682. [PMID: 39700923 DOI: 10.1016/j.jenvman.2024.123682] [Received: 03/18/2024] [Revised: 12/04/2024] [Accepted: 12/08/2024] [Indexed: 12/21/2024]
Abstract
The increasing demand for air pollution control has driven the application of low-cost sensors (LCS) in air quality monitoring, enabling higher observation density and improved air quality predictions. However, the inherent limitations in data quality from LCS necessitate the development of effective methodologies to optimize their application. This study established a hybrid framework to enhance the accuracy of spatiotemporal predictions of PM2.5. Standard instrument measurements were employed as reference data for the remote calibration of LCS. To account for local emission characteristics, the calibration model was trained using statistical values from LCS during periods of reduced anthropogenic emissions. This calibration approach significantly improved data quality, increasing the R² of LCS data from 0.60 to 0.85. Subsequently, an advanced predictive model, STXGBoost, was developed, combining Kriging interpolation with high-density LCS data to integrate temporal trends and geographic spatial correlations. The STXGBoost model effectively captured the spatiotemporal variability of PM2.5 data, producing accurate PM2.5 prediction maps at high spatiotemporal resolution, with R² values of 0.96, 0.92, and 0.89 for 1-h, 4-h, and 48-h predictions, respectively. These findings demonstrate the feasibility of generating high-resolution urban air pollution maps by integrating high-density ground monitoring data with advanced computational approaches. This framework provides valuable support for precise management and informed decision-making in urban atmospheric environments.
Affiliation(s)
- Shiqi Zhao
  - Division of Thermophysics Metrology, National Institute of Metrology, Beijing, 100029, China
  - Zhengzhou Institute of Metrology, Zhengzhou, 450001, China
- Hong Lin
  - Division of Thermophysics Metrology, National Institute of Metrology, Beijing, 100029, China
  - Zhengzhou Institute of Metrology, Zhengzhou, 450001, China
- Hongjun Wang
  - Division of Thermophysics Metrology, National Institute of Metrology, Beijing, 100029, China
- Gege Liu
  - Division of Thermophysics Metrology, National Institute of Metrology, Beijing, 100029, China
  - Zhengzhou Institute of Metrology, Zhengzhou, 450001, China
- Xiaoning Wang
  - Zhengzhou Institute of Metrology, Zhengzhou, 450001, China
- Kailun Du
  - Zhengzhou Institute of Metrology, Zhengzhou, 450001, China
- Ge Ren
  - Division of Thermophysics Metrology, National Institute of Metrology, Beijing, 100029, China
  - Zhengzhou Institute of Metrology, Zhengzhou, 450001, China

5. Alotaibi S, Almujibah H, Mohamed KAA, Elhassan AAM, Alsulami BT, Alsaluli A, Khattak A. Towards Cleaner Cities: Estimating Vehicle-Induced PM2.5 with Hybrid EBM-CMA-ES Modeling. TOXICS 2024; 12:827. [PMID: 39591005 PMCID: PMC11598042 DOI: 10.3390/toxics12110827] [Received: 10/08/2024] [Revised: 11/15/2024] [Accepted: 11/17/2024] [Indexed: 11/28/2024]
Abstract
In developing countries, vehicle emissions are a major source of atmospheric pollution, worsened by aging vehicle fleets and less stringent emissions regulations. This results in elevated levels of particulate matter, contributing to the degradation of urban air quality and increasing concerns over the broader effects of atmospheric emissions on human health. This study proposes a Hybrid Explainable Boosting Machine (EBM) framework, optimized using the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), to predict vehicle-related PM2.5 concentrations and analyze contributing factors. Air quality data were collected from Open-Seneca sensors installed along the Nairobi Expressway, alongside meteorological and traffic data. The CMA-ES-tuned EBM model achieved a Mean Absolute Error (MAE) of 2.033 and an R² of 0.843, outperforming other models. A key strength of the EBM is its interpretability, which revealed that location was the most critical factor influencing PM2.5 concentrations, followed by humidity and temperature. Elevated PM2.5 levels were observed near the Westlands roundabout, and medium to high humidity correlated with higher PM2.5 levels. Furthermore, the interaction between humidity and traffic volume played a significant role in determining PM2.5 concentrations. By combining CMA-ES for hyperparameter optimization and EBM for prediction and interpretation, this study achieves high predictive accuracy, yields insights into the environmental drivers of urban air pollution, and offers practical guidance for air quality management.
Affiliation(s)
- Saleh Alotaibi
  - Civil and Environmental Engineering Department, Faculty of Engineering - Rabigh Branch, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Hamad Almujibah
  - Department of Civil Engineering, College of Engineering, Taif University, Taif 21944, Saudi Arabia
- Khalaf Alla Adam Mohamed
  - Department of Civil Engineering, College of Engineering, Bisha University, Bisha 61361, Saudi Arabia
- Adil A. M. Elhassan
  - Department of Civil Engineering, College of Engineering, Taif University, Taif 21944, Saudi Arabia
- Badr T. Alsulami
  - Department of Civil Engineering, College of Engineering and Architecture, Umm Al-Qura University, Makkah 24382, Saudi Arabia
- Abdullah Alsaluli
  - Department of Civil Engineering, College of Engineering, Taif University, Taif 21944, Saudi Arabia
- Afaq Khattak
  - Department of Civil, Structural and Environmental Engineering, Trinity College Dublin, D02 PN40 Dublin, Ireland

6. Hu Y, Li Q, Shi X, Yan J, Chen Y. Domain knowledge-enhanced multi-spatial multi-temporal PM2.5 forecasting with integrated monitoring and reanalysis data. ENVIRONMENT INTERNATIONAL 2024; 192:108997. [PMID: 39293234 DOI: 10.1016/j.envint.2024.108997] [Received: 02/17/2024] [Revised: 07/31/2024] [Accepted: 09/02/2024] [Indexed: 09/20/2024]
Abstract
Accurate air quality forecasting is crucial for public health, environmental monitoring and protection, and urban planning. However, existing methods fail to effectively utilize multi-scale information, both spatially and temporally. Spatially, there is a lack of integration between individual monitoring stations and city-wide scales. Temporally, the periodic nature of air quality variations is often overlooked or inadequately considered. To overcome these limitations, we conduct a thorough analysis of the data and tasks, integrating spatio-temporal multi-scale domain knowledge. We present a novel Multi-spatial Multi-temporal air quality forecasting method based on Graph Convolutional Networks and Gated Recurrent Units (M2G2), bridging the gap in air quality forecasting across spatial and temporal scales. The proposed framework consists of two modules: Multi-scale Spatial GCN (MS-GCN) for spatial information fusion and Multi-scale Temporal GRU (MT-GRU) for temporal information integration. In the spatial dimension, the MS-GCN module employs a bidirectional learnable structure and a residual structure, enabling comprehensive information exchange between individual monitoring stations and the city-scale graph. In the temporal dimension, the MT-GRU module adaptively combines information from different temporal scales through parallel hidden states. Leveraging meteorological indicators and four air quality indicators, we present comprehensive comparative analyses and ablation experiments, showing that M2G2 is more accurate than nine currently available advanced approaches across all aspects. The improvements of M2G2 over the second-best method in RMSE for 72-h future predictions are as follows: PM2.5: 6%-10%; PM10: 5%-7%; NO2: 5%-16%; O3: 6%-9%. Furthermore, we demonstrate the effectiveness of each module of M2G2 through an ablation study. We also conduct a sensitivity analysis of air quality and meteorological data, finding that the introduction of O3 adversely impacts the prediction accuracy of PM2.5.
Affiliation(s)
- Yuxiao Hu
  - Department of Building Environment and Energy Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, China
  - Ningbo Institute of Digital Twin, Eastern Institute of Technology, Ningbo 315200, China
- Qian Li
  - Ningbo Institute of Digital Twin, Eastern Institute of Technology, Ningbo 315200, China
  - School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Xiaodan Shi
  - School of Business, Society and Technology, Mälardalens University, Västerås 72123, Sweden
- Jinyue Yan
  - Department of Building Environment and Energy Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, China
- Yuntian Chen
  - Ningbo Institute of Digital Twin, Eastern Institute of Technology, Ningbo 315200, China

7. Rakholia R, Le Q, Vu K, Ho BQ, Carbajo RS. Accurate PM2.5 urban air pollution forecasting using multivariate ensemble learning Accounting for evolving target distributions. CHEMOSPHERE 2024; 364:143097. [PMID: 39154769 DOI: 10.1016/j.chemosphere.2024.143097] [Received: 02/19/2024] [Revised: 07/28/2024] [Accepted: 08/13/2024] [Indexed: 08/20/2024]
Abstract
Over the past decades, air pollution has caused severe environmental and public health problems. According to the World Health Organization (WHO), fine particulate matter (PM2.5), a key component reflecting air quality, is the fourth leading cause of death worldwide after cardiovascular disease, smoking, and diet. Various research efforts have aimed to develop PM2.5 forecasting models that can be integrated into a solution to mitigate the adverse effects of air pollution. However, PM2.5 forecasting is challenging because air pollution data are non-stationary and influenced by multiple random effects. This paper proposes a multivariate multi-step ensemble machine learning model for predicting continuous 24-h PM2.5 concentrations, considering meteorological conditions, the rolling mean of the PM2.5 time series, and temporal features. PM2.5 varies strongly with location and time, so forecasting results from one location are insufficient to represent the level of air pollution for an entire city. In this study, we established six real-time air quality monitoring sites in different regions, including traffic, residential, and industrial areas in Ho Chi Minh City (HCMC), and generated forecasting results for each station. Various statistical methods are used to evaluate the performance of the model. The experimental results confirm that the model performs well, substantially improving forecasting accuracy compared to existing PM2.5 forecasting models developed for HCMC. In addition, we analyze the contribution of different feature groups to model performance. The model can serve as a reference for citizens scheduling local travel and for healthcare providers issuing early warnings.
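The feature groups named in this abstract (rolling mean of the PM2.5 series and temporal features) can be sketched as follows. This is an illustrative construction only: the window length, the choice of hour and weekday as temporal features, the function name `build_features`, and the sample values are assumptions, not details from the paper. The rolling mean uses only past observations so that the target value does not leak into its own predictors.

```python
from datetime import datetime, timedelta


def build_features(timestamps, pm25, window=3):
    """Build one feature row per time step from past PM2.5 values.

    The rolling mean covers the `window` observations strictly before the
    target time step. Window size and feature choice are illustrative.
    """
    rows = []
    for i in range(window, len(pm25)):
        past = pm25[i - window:i]          # values before the target step
        ts = timestamps[i]
        rows.append({
            "hour": ts.hour,               # temporal feature: hour of day
            "weekday": ts.weekday(),       # temporal feature: Monday = 0
            "rolling_mean": sum(past) / window,
            "target": pm25[i],
        })
    return rows


# Hypothetical hourly series starting on Monday, 1 January 2024.
start = datetime(2024, 1, 1, 0)
times = [start + timedelta(hours=h) for h in range(6)]
values = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
features = build_features(times, values)
print(features[0])
```

Meteorological variables would be appended to each row in the same way; they are omitted here to keep the sketch short.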
Affiliation(s)
- Rajnish Rakholia
  - Ireland's National Centre for Artificial Intelligence (CeADAR), University College Dublin, NexusUCD, Belfield Office Park, Dublin, Ireland
- Quan Le
  - Ireland's National Centre for Artificial Intelligence (CeADAR), University College Dublin, NexusUCD, Belfield Office Park, Dublin, Ireland
- Khue Vu
  - Institute for Environment and Resources (IER), Ho Chi Minh City, 700000, Viet Nam
- Bang Quoc Ho
  - Institute for Environment and Resources (IER), Ho Chi Minh City, 700000, Viet Nam
  - Department of Science and Technology, Vietnam National University, Ho Chi Minh City, 700000, Viet Nam
- Ricardo Simon Carbajo
  - Ireland's National Centre for Artificial Intelligence (CeADAR), University College Dublin, NexusUCD, Belfield Office Park, Dublin, Ireland

8. Zhou S, Wang W, Zhu L, Qiao Q, Kang Y. Deep-learning architecture for PM2.5 concentration prediction: A review. ENVIRONMENTAL SCIENCE AND ECOTECHNOLOGY 2024; 21:100400. [PMID: 38439920 PMCID: PMC10910069 DOI: 10.1016/j.ese.2024.100400] [Received: 07/24/2023] [Revised: 02/05/2024] [Accepted: 02/06/2024] [Indexed: 03/06/2024]
Abstract
Accurately predicting the concentration of fine particulate matter (PM2.5) is crucial for evaluating air pollution levels and public exposure. Recent years have seen a significant rise in the use of deep learning (DL) models for forecasting PM2.5 concentrations. Nonetheless, there is a lack of unified and standardized frameworks for assessing the performance of DL-based PM2.5 prediction models. Here we extensively review DL-based hybrid models for forecasting PM2.5 levels according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We examine the similarities and differences among various DL models in predicting PM2.5 by comparing their complexity and effectiveness. We categorize PM2.5 DL methodologies into seven types based on performance and application conditions, including four types of DL-based models and three types of hybrid learning models. Our research indicates that established deep learning architectures are commonly used and respected for their efficiency. However, many of these models fall short in terms of innovation and interpretability. Conversely, models hybridized with traditional approaches, such as deterministic and statistical models, exhibit high interpretability but compromise on accuracy and speed. In addition, hybrid DL models, which represent the pinnacle of innovation among the studied models, encounter issues with interpretability. We introduce a novel three-dimensional evaluation framework, the Dataset-Method-Experiment Standard (DMES), to unify and standardize the evaluation of PM2.5 predictions using DL models. This review provides a framework for future evaluations of DL-based models, which could inspire researchers to standardize DL model usage in PM2.5 prediction and improve the quality of related studies.
Affiliation(s)
- Shiyun Zhou
  - Institute of Environmental Information, Chinese Research Academy of Environmental Sciences, Beijing 100012, China
  - School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Wei Wang
  - Institute of Environmental Information, Chinese Research Academy of Environmental Sciences, Beijing 100012, China
- Long Zhu
  - College of Water Sciences, Beijing Normal University, Beijing 100875, China
- Qi Qiao
  - Institute of Environmental Information, Chinese Research Academy of Environmental Sciences, Beijing 100012, China
- Yulin Kang
  - Institute of Environmental Information, Chinese Research Academy of Environmental Sciences, Beijing 100012, China

9. Ghahremanloo M, Choi Y, Singh D. Deep learning bias correction of GEMS tropospheric NO2: A comparative validation of NO2 from GEMS and TROPOMI using Pandora observations. ENVIRONMENT INTERNATIONAL 2024; 190:108818. [PMID: 38878653 DOI: 10.1016/j.envint.2024.108818] [Received: 04/10/2024] [Revised: 06/05/2024] [Accepted: 06/11/2024] [Indexed: 08/28/2024]
Abstract
Despite advancements in satellite instruments, such as those in geostationary orbit, biases continue to affect the accuracy of satellite data. This research pioneers the use of a deep convolutional neural network to correct bias in tropospheric column density of NO2 (TCDNO2) from the Geostationary Environment Monitoring Spectrometer (GEMS) during 2021-2023. Initially, we validate GEMS TCDNO2 against Pandora observations and compare its accuracy with measurements from the TROPOspheric Monitoring Instrument (TROPOMI). GEMS displays acceptable accuracy in TCDNO2 measurements, with a correlation coefficient (R) of 0.68, an index of agreement (IOA) of 0.79, and a mean absolute bias (MAB) of 5.73321 × 10¹⁵ molecules/cm², though it is not highly accurate. The evaluation showcases moderate to high accuracy of GEMS TCDNO2 across all Pandora stations, with R values spanning from 0.46 to 0.80. Comparing TCDNO2 from GEMS and TROPOMI at TROPOMI overpass time shows satisfactory performance of GEMS TCDNO2 measurements, achieving R, IOA, and MAB values of 0.71, 0.78, and 6.82182 × 10¹⁵ molecules/cm², respectively. However, these figures are overshadowed by TROPOMI's superior accuracy, which reports R, IOA, and MAB values of 0.81, 0.89, and 3.26769 × 10¹⁵ molecules/cm², respectively. While GEMS overestimates TCDNO2 by 52% at TROPOMI overpass time, TROPOMI underestimates it by 9%. The deep learning bias corrected GEMS TCDNO2 (GEMS-DL) demonstrates a marked enhancement in the accuracy of original GEMS TCDNO2 measurements. The GEMS-DL product improves R from 0.68 to 0.88, IOA from 0.79 to 0.93, MAB from 5.73321 × 10¹⁵ to 2.67659 × 10¹⁵ molecules/cm², and reduces MAB percentage (MABP) from 64% to 30%. This represents a significant reduction in bias, exceeding 50%. Although the original GEMS product overestimates TCDNO2 by 28%, the GEMS-DL product remarkably minimizes this error, underestimating TCDNO2 by a mere 1%. Spatial cross-validation across Pandora stations shows a significant reduction in MABP, from a range of 45%-105.6% in original GEMS data to 24%-59% in GEMS-DL.
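Two of the validation statistics reported in this abstract can be sketched in a few lines. The example assumes that IOA is Willmott's index of agreement and that MAB is the mean of the absolute differences between satellite and reference values; the abstract does not spell out either definition, and MABP is left out because its normalization is not given. The sample columns are hypothetical, expressed in units of 10¹⁵ molecules/cm².

```python
def index_of_agreement(obs, pred):
    """Willmott's index of agreement: 1 means perfect agreement.

    Assumed to be the IOA used in the abstract. Undefined when every
    observation and prediction equals the observed mean (zero denominator).
    """
    mean_o = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den


def mean_absolute_bias(obs, pred):
    """Mean of |pred - obs| (assumed definition of MAB)."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


# Hypothetical column densities, in units of 1e15 molecules/cm^2.
reference = [1.0, 2.0, 3.0, 4.0]      # stand-in for ground-based observations
satellite = [1.5, 2.5, 2.5, 4.5]      # stand-in for satellite retrievals

ioa = index_of_agreement(reference, satellite)
mab = mean_absolute_bias(reference, satellite)
print(round(ioa, 4), mab)
```

Unlike the correlation coefficient, IOA penalizes systematic offsets, which is why a bias correction can raise IOA even when R changes little.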
Affiliation(s)
- Masoud Ghahremanloo
  - Department of Earth and Atmospheric Sciences, University of Houston, Houston, TX, USA 77004
- Yunsoo Choi
  - Department of Earth and Atmospheric Sciences, University of Houston, Houston, TX, USA 77004
- Deveshwar Singh
  - Department of Earth and Atmospheric Sciences, University of Houston, Houston, TX, USA 77004

10. McCracken T, Chen P, Metcalf A, Fan C. Quantifying the impacts of Canadian wildfires on regional air pollution networks. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 928:172461. [PMID: 38615767 DOI: 10.1016/j.scitotenv.2024.172461] [Received: 11/30/2023] [Revised: 04/10/2024] [Accepted: 04/11/2024] [Indexed: 04/16/2024]
Abstract
Wildfire smoke greatly impacts regional atmospheric systems, causing changes in the behavior of pollution. However, the impacts of wildfire smoke on pollution behavior are not easily quantifiable due to the complex nature of atmospheric systems. Air pollution correlation networks have been used to quantify air pollution behavior during ambient conditions, but it is unknown how extreme pollution events impact these networks. Therefore, we propose a multidimensional air pollution correlation network framework to quantify the impacts of wildfires on air pollution behavior. The impacts are quantified by comparing two time periods, one during the 2023 Canadian wildfires and one during normal conditions, with two complex network types for each period. In this study, the value network represents PM2.5 concentrations and the rate network represents the rate of change of PM2.5 concentrations. Wildfires' impacts on air pollution behavior are captured by structural changes in the networks. The wildfires caused a discontinuous phase transition during percolation in both network types, which represents non-random organization of the most significant spatiotemporal correlations. Additionally, wildfires changed the connectivity of stations, leading to more interconnected networks with different influential stations. During the wildfire period, highly polluted areas are more likely to form connections in the network, quantified by 86% and 19% increases in the connectivity of the value and rate networks, respectively, compared to the normal period. In this study, we provide new understanding of the impacts of wildfires on air pollution correlation networks, show how our method can yield important insights into air pollution patterns, and discuss potential applications of our methodologies. This study aims to enhance capabilities for wildfire smoke exposure mitigation and response strategies.
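The idea of a value network and a rate network can be illustrated with a minimal sketch. It assumes that stations are linked when the Pearson correlation of their series reaches a fixed threshold and that connectivity is summarized as edge density; the abstract does not specify the authors' correlation measure, thresholding, or connectivity metric, so all of these, along with the function names and the station data, are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length series (undefined if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_network(series_by_station, threshold=0.8):
    """Return the set of station pairs whose correlation is >= threshold."""
    edges = set()
    for a, b in combinations(sorted(series_by_station), 2):
        if pearson(series_by_station[a], series_by_station[b]) >= threshold:
            edges.add((a, b))
    return edges


def rate_of_change(series):
    """First differences of a series, used as input to the rate network."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def density(edges, n_nodes):
    """Fraction of possible station pairs that are connected."""
    return len(edges) / (n_nodes * (n_nodes - 1) / 2)


# Hypothetical PM2.5 series for three stations.
stations = {
    "A": [10.0, 12.0, 15.0, 19.0, 24.0],
    "B": [11.0, 13.0, 16.0, 20.0, 25.0],
    "C": [30.0, 25.0, 27.0, 22.0, 24.0],
}
value_net = correlation_network(stations)
rate_net = correlation_network({k: rate_of_change(v) for k, v in stations.items()})
print(value_net, rate_net, density(value_net, len(stations)))
```

Comparing the density of such networks between a wildfire period and a normal period is one simple way to express a change in connectivity as a percentage.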
Affiliation(s)
- Teague McCracken
  - School of Civil and Environmental Engineering, Clemson University, 455 Bracket Hall, Clemson, SC 29631, USA
- Pei Chen
  - Department of Computer Science and Engineering, Texas A&M University, L.F. Peterson Building, College Station, TX 77843, USA
- Andrew Metcalf
  - School of Civil and Environmental Engineering, Clemson University, 455 Bracket Hall, Clemson, SC 29631, USA
- Chao Fan
  - School of Civil and Environmental Engineering, Clemson University, 455 Bracket Hall, Clemson, SC 29631, USA

11
|
Chaves MGD, da Silva AB, Mercuri EGF, Noe SM. Particulate matter forecast and prediction in Curitiba using machine learning. Front Big Data 2024; 7:1412837. [PMID: 38873282 PMCID: PMC11169811 DOI: 10.3389/fdata.2024.1412837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Accepted: 05/17/2024] [Indexed: 06/15/2024] Open
Abstract
Introduction: Air quality is directly affected by pollutant emission from vehicles, especially in large cities and metropolitan areas or when there is no compliance check for vehicle emission standards. Particulate Matter (PM) is one of the pollutants emitted from fuel burning in internal combustion engines and remains suspended in the atmosphere, causing respiratory and cardiovascular health problems to the population. In this study, we analyzed the interaction between vehicular emissions, meteorological variables, and particulate matter concentrations in the lower atmosphere, presenting methods for predicting and forecasting PM2.5.
Methods: Meteorological and vehicle flow data from the city of Curitiba, Brazil, and particulate matter concentration data from optical sensors installed in the city between 2020 and 2022 were organized in hourly and daily averages. Prediction and forecasting were based on two machine learning models: Random Forest (RF) and Long Short-Term Memory (LSTM) neural network. The baseline model for prediction was chosen as the Multiple Linear Regression (MLR) model, and for forecast, we used the naive estimation as baseline.
Results: RF showed that on hourly and daily prediction scales, the planetary boundary layer height was the most important variable, followed by wind gust and wind velocity in hourly or daily cases, respectively. The highest PM prediction accuracy (99.37%) was found using the RF model on a daily scale. For forecasting, the highest accuracy was 99.71% using the LSTM model for 1-h forecast horizon with 5 h of previous data used as input variables.
Discussion: The RF and LSTM models were able to improve prediction and forecasting compared with MLR and Naive, respectively. The LSTM was trained with data corresponding to the period of the COVID-19 pandemic (2020 and 2021) and was able to forecast the concentration of PM2.5 in 2022, in which the data show that there was greater circulation of vehicles and higher peaks in the concentration of PM2.5. Our results can help the physical understanding of factors influencing pollutant dispersion from vehicle emissions at the lower atmosphere in urban environment. This study supports the formulation of new government policies to mitigate the impact of vehicle emissions in large cities.
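The naive forecasting baseline mentioned in the Methods can be sketched as a persistence forecast. This is an assumption about what "naive estimation" means here (the value observed at time t is reused as the forecast for t + horizon); the hourly readings are hypothetical.

```python
def naive_forecast(series, horizon=1):
    """Persistence baseline: the forecast for t + horizon is the value seen at t."""
    if horizon < 1 or horizon >= len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    return series[:-horizon]


def mean_absolute_error(observed, predicted):
    """Average absolute difference between paired observations and predictions."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical hourly PM2.5 readings (ug/m3).
hourly_pm25 = [8.0, 9.0, 12.0, 11.0, 15.0]

horizon = 1
predicted = naive_forecast(hourly_pm25, horizon)
observed = hourly_pm25[horizon:]
baseline_mae = mean_absolute_error(observed, predicted)
print(baseline_mae)
```

A learned forecaster is only useful if its error on the same `observed` values is lower than `baseline_mae`.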
Affiliation(s)
- Steffen Manfred Noe
  - Institute of Forestry and Engineering, Estonian University of Life Sciences, Tartu, Estonia
12
|
Xia Y, McCracken T, Liu T, Chen P, Metcalf A, Fan C. Understanding the Disparities of PM2.5 Air Pollution in Urban Areas via Deep Support Vector Regression. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:8404-8416. [PMID: 38698567 DOI: 10.1021/acs.est.3c09177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
In densely populated urban areas, PM2.5 has a direct impact on residents' health and quality of life. Thus, understanding the disparities of PM2.5 is crucial for ensuring urban sustainability and public health. Traditional prediction models often overlook the spillover effects within urban areas and the complexity of the data, leading to inaccurate spatial predictions of PM2.5. We propose Deep Support Vector Regression (DSVR), which models the urban area as a graph, with grid center points as the nodes and the connections between grids as the edges. Natural and human-activity features of each grid are used to initialize the representation of each node. Based on the graph, DSVR uses random diffusion-based deep learning to quantify the spillover effects of PM2.5. It leverages random walks to uncover more extensive spillover relationships between nodes, thereby capturing both the local and nonlocal spillover effects of PM2.5. It then performs predictive learning using the feature vectors that encapsulate spillover effects, enhancing the understanding of PM2.5 disparities and connections across different regions. By applying our proposed model in the northern region of New York for predictive performance analysis, we found that DSVR consistently outperforms other models. During periods of PM2.5 surges, the R-square of DSVR reaches as high as 0.729, outperforming non-spillover models by 2.5 to 5.7 times and traditional spatial metric models by 2.2 to 4.6 times. Therefore, our proposed model holds significant importance for understanding disparities of PM2.5 air pollution in urban areas, taking the first steps toward a new method that considers both the spillover effects and the nonlinear features of the data for prediction.
Affiliation(s)
- Yuling Xia
  - School of Mathematics, Southwest Jiaotong University, Sichuan province Chengdu 611756, China
- Teague McCracken
  - School of Civil and Environmental Engineering and Earth Sciences, Clemson University, Clemson, South Carolina 29634, United States
- Tong Liu
  - School of Civil and Environmental Engineering and Earth Sciences, Clemson University, Clemson, South Carolina 29634, United States
- Pei Chen
  - Department of Computer Science and Engineering, Texas A&M University, College Station, Texas 77843, United States
- Andrew Metcalf
  - School of Civil and Environmental Engineering and Earth Sciences, Clemson University, Clemson, South Carolina 29634, United States
- Chao Fan
  - School of Civil and Environmental Engineering and Earth Sciences, Clemson University, Clemson, South Carolina 29634, United States
13
|
Ma Z, Wang B, Luo W, Jiang J, Liu D, Wei H, Luo H. Air pollutant prediction model based on transfer learning two-stage attention mechanism. Sci Rep 2024; 14:7385. [PMID: 38548823 PMCID: PMC10978953 DOI: 10.1038/s41598-024-57784-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Accepted: 03/21/2024] [Indexed: 04/01/2024] Open
Abstract
Atmospheric pollution significantly impacts the regional economy and human health, and its prediction has been increasingly emphasized. The performance of traditional prediction methods is limited by the lack of historical data at new atmospheric monitoring sites. Therefore, this paper proposes a two-stage attention mechanism model based on transfer learning (TL-AdaBiGRU). The first stage of the model uses a temporal distribution characterization algorithm to segment the air pollutant sequences into periods, and introduces a temporal attention mechanism that assigns self-learned weights to the period segments in order to filter out essential period features. The second stage introduces a multi-head external attention mechanism to mine key features of the network's hidden layer. Finally, the knowledge learned by the model at the source-domain site is transferred to the new site to improve prediction capability there. The results show that (1) the model is built from a data-distribution perspective and mines the critical information within the sequence of periodic segments in depth; (2) the model employs a unique two-stage attention mechanism to capture complex nonlinear relationships in air pollutant data; and (3) compared with existing models, the mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE) of the model decreased by 14%, 13%, and 4%, respectively, and the prediction accuracy was greatly improved.
Affiliation(s)
- Zhanfei Ma
  - School of Information Science and Technology, Baotou Teachers' College, Baotou, 014010, Inner Mongolia, China
  - School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, 014010, Inner Mongolia, China
- Bisheng Wang
  - School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, 014010, Inner Mongolia, China.
- Wenli Luo
  - School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, 014010, Inner Mongolia, China
- Jing Jiang
  - School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, 014010, Inner Mongolia, China
- Dongxiang Liu
  - School of Information Science and Technology, Baotou Teachers' College, Baotou, 014010, Inner Mongolia, China
- Hui Wei
  - School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, 014010, Inner Mongolia, China
- HaoYe Luo
  - School of Information Science and Technology, Baotou Teachers' College, Baotou, 014010, Inner Mongolia, China
14
|
Lee YM, Lin GY, Le TC, Hong GH, Aggarwal SG, Yu JY, Tsai CJ. Characterization of spatial-temporal distribution and microenvironment source contribution of PM2.5 concentrations using a low-cost sensor network with artificial neural network/kriging techniques. ENVIRONMENTAL RESEARCH 2024; 244:117906. [PMID: 38101720 DOI: 10.1016/j.envres.2023.117906] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/03/2023] [Revised: 12/07/2023] [Accepted: 12/08/2023] [Indexed: 12/17/2023]
Abstract
Low-cost sensor (LCS) networks are widely used to improve the resolution of the spatial-temporal distribution of air pollutant concentrations in urban areas. However, studies of air pollution source contributions to the microenvironment, especially in industrial and mixed-use housing areas, remain incomplete. This study investigated the spatial-temporal distribution and source contributions of PM2.5 in an urban area based on 6 months of LCS network data. An Artificial Neural Network (ANN) was used to calibrate the PM2.5 measured by the LCS network. The calibrated PM2.5 agreed with reference PM2.5 measured by the BAM-1020, with R2 of 0.85, MNE of 30.91%, and RMSE of 3.73 μg/m3, which meets the criteria for hotspot identification and personal exposure study purposes. The Kriging method was then used to establish the spatial-temporal distribution of PM2.5 concentrations in the urban area. Results showed that the highest average PM2.5 concentration occurred during autumn and winter due to monsoon and topographic effects. From a diurnal perspective, the highest PM2.5 concentration was observed during the daytime due to heavy traffic emissions and industrial production. Based on the present ANN-based microenvironment source contribution assessment model, temples, fried chicken shops, traffic emissions in shopping and residential zones, and industrial activities such as mechanical manufacturing and precision metal machining were identified as sources of PM2.5. The numerical algorithm coupled with the LCS network presented in this study is a practical framework for PM2.5 hotspot and source identification, aiding decision-makers in reducing atmospheric PM2.5 concentrations and formulating regional air pollution control strategies.
Affiliation(s)
- Yi-Ming Lee
  - Institute of Environmental Engineering, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Guan-Yu Lin
  - Department of Environmental Science and Engineering, Tunghai University, Taichung, Taiwan.
- Thi-Cuc Le
  - Institute of Environmental Engineering, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Gung-Hwa Hong
  - Institute of Environmental Engineering, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Shankar G Aggarwal
  - Environmental Sciences & Biomedical Metrology Division, CSIR-National Physical Laboratory, New Delhi, India
- Jhih-Yuan Yu
  - Division Chief, Department of Environmental Monitoring and Information Management, Environmental Protection Administration, Taiwan
- Chuen-Jinn Tsai
  - Institute of Environmental Engineering, National Yang Ming Chiao Tung University, Hsinchu, Taiwan.
15
|
Wang H, Zhang L, Wu R, Cen Y. Spatio-temporal fusion of meteorological factors for multi-site PM2.5 prediction: A deep learning and time-variant graph approach. ENVIRONMENTAL RESEARCH 2023; 239:117286. [PMID: 37797668 DOI: 10.1016/j.envres.2023.117286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 09/29/2023] [Accepted: 09/30/2023] [Indexed: 10/07/2023]
Abstract
In the field of environmental science, traditional methods for predicting PM2.5 concentrations primarily focus on singular temporal or spatial dimensions. This approach presents certain limitations when it comes to deeply mining the joint influence of multiple monitoring sites and their inherent connections with meteorological factors. To address this issue, we introduce an innovative deep-learning-based multi-graph model using Beijing as the study case. This model consists of two key modules: firstly, the 'Meteorological Factor Spatio-Temporal Feature Extraction Module'. This module deeply integrates spatio-temporal features of hourly meteorological data by employing Graph Convolutional Networks (GCN) and Long Short-Term Memory (LSTM) for spatial and temporal encoding respectively. Subsequently, through an attention mechanism, it retrieves a feature tensor associated with air pollutants. Secondly, these features are amalgamated with PM2.5 concentration values, allowing the 'PM2.5 Concentration Prediction Module' to predict with enhanced accuracy the joint influence across multiple monitoring sites. Our model exhibits significant advantages over traditional methods in processing the joint impact of multiple sites and their associated meteorological factors. By providing new perspectives and tools for the in-depth understanding of urban air pollutant distribution and optimization of air quality management, this model propels us towards a more comprehensive approach in tackling air pollution issues.
Affiliation(s)
- Hongqing Wang
  - Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100094, China; University of Chinese Academy of Sciences, Beijing, 100049, China.
- Lifu Zhang
  - Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100094, China.
- Rong Wu
  - Department of Mathematical Sciences, Tsinghua University, Beijing, 100084, China.
- Yi Cen
  - Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100094, China.
16
|
Park J, Yang JH, Jung J, Kwak IS, Choe JK, An J. Comparative analysis of the capability of the extended biotic ligand model and machine learning approaches to predict arsenate toxicity. CHEMOSPHERE 2023; 344:140350. [PMID: 37793548 DOI: 10.1016/j.chemosphere.2023.140350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 09/04/2023] [Accepted: 10/01/2023] [Indexed: 10/06/2023]
Abstract
Assessment of inorganic arsenate (As(V)) is critical for ensuring a sustainable environment because of its adverse effects on humans and ecosystems. This study is the first to attempt to predict As(V) toxicity to the bioluminescent bacterium Aliivibrio fischeri exposed to varying As(V) dosages and environmental factors (pH and phosphate concentration) using six machine learning (ML)-guided models. The predicted toxicity values were compared with those predicted using the extended biotic ligand model (BLM) we previously developed to evaluate the toxic effect of oxyanion (i.e., As(V)). The relationship between the variables (input features) and toxicity (output) was found to play an important role in the prediction accuracy of each ML-guided model. The results indicated that the extended BLM had the highest prediction accuracy, with a root mean square error (RMSE) of 12.997. However, with an RMSE of 14.361, the multilayer perceptron (MLP) model exhibited quasi-accurate prediction, despite having been trained with a relatively small dataset (n = 256). In view of simplicity, an MLP model is compatible with an extended BLM and does not require expert knowledge for the derivation of specific parameters, such as binding fraction and binding constant values. Furthermore, with the development and employment of reliable in-situ sensing techniques, monitoring data are expected to be augmented faster to provide sufficient training data for the improvement of prediction accuracy which may, thus, allow it to outperform the extended BLM after obtaining sufficient data.
Affiliation(s)
- Junyoung Park
  - Department of Civil and Environmental Engineering, Seoul National University, Seoul, 08826, South Korea; Institute of Construction and Environmental Engineering, Seoul National University, Seoul, 08826, South Korea
- Jae Hwan Yang
  - Division of Urban Planning and Transportation, Seoul Institute, Seoul, 06756, South Korea
- Jihyeun Jung
  - Department of Civil and Environmental Engineering, Seoul National University, Seoul, 08826, South Korea
- Ihn-Sil Kwak
  - Department of Ocean Integrated Science, Chonnam National University, Yeosu, 59626, South Korea
- Jong Kwon Choe
  - Department of Civil and Environmental Engineering, Seoul National University, Seoul, 08826, South Korea
- Jinsung An
  - Department of Civil & Environmental Engineering, Hanyang University, Ansan, 15588, South Korea.
17
|
Ameri R, Hsu CC, Band SS, Zamani M, Shu CM, Khorsandroo S. Forecasting PM2.5 concentration based on integrating of CEEMDAN decomposition method with SVM and LSTM. ECOTOXICOLOGY AND ENVIRONMENTAL SAFETY 2023; 266:115572. [PMID: 37837695 DOI: 10.1016/j.ecoenv.2023.115572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 09/28/2023] [Accepted: 10/09/2023] [Indexed: 10/16/2023]
Abstract
With urbanization and increasing consumption, there is a growing need to prioritize sustainable development across various industries. In particular, sustainable development is hindered by air pollution, which poses a threat to both living organisms and the environment. The emission of combustion gases containing particulate matter (PM2.5) during human and social activities is a major cause of air pollution. To mitigate health risks, it is crucial to have accurate and reliable methods for forecasting PM2.5 levels. In this study, we propose a novel approach that combines support vector machine (SVM) and long short-term memory (LSTM) with complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) to forecast PM2.5 concentrations. The methodology involves extracting intrinsic mode function (IMF) components through CEEMDAN and subsequently applying different regression models (SVM and LSTM) to forecast each component. The Naive Evolution algorithm is employed to determine the optimal parameters for combining CEEMDAN, SVM, and LSTM. Daily PM2.5 concentrations in Kaohsiung, Taiwan from 2019 to 2021 were collected to train models and evaluate their performance. The performance of the proposed model is evaluated using metrics such as mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), and coefficient of determination (R2) for each district. Overall, our proposed model demonstrates superior performance in terms of MAE (1.858), MSE (7.2449), RMSE (2.6682), and R2 (0.9169) values compared to other methods for 1-day ahead PM2.5 forecasting. Furthermore, our proposed model also achieves the best performance in forecasting PM2.5 for 3- and 7-day ahead predictions.
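The four evaluation metrics named in this abstract have standard definitions, sketched below in plain Python. The observed and forecast values are hypothetical and are not taken from the study.

```python
def regression_metrics(observed, predicted):
    """Return MAE, MSE, RMSE, and R2 for paired observations and predictions."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = mse ** 0.5
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical daily PM2.5 values (ug/m3) and 1-day-ahead forecasts.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 17.0]

metrics = regression_metrics(observed, predicted)
print(metrics)
```

MAE and RMSE are in the units of the target variable, MSE is in squared units, and R2 is dimensionless, which is worth keeping in mind when comparing values across studies.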
Affiliation(s)
- Rasoul Ameri
  - Department of Information Management, National Yunlin University of Science and Technology, Douliou, Taiwan
- Chung-Chian Hsu
  - Department of Information Management, International Graduate Institute of Artificial Intelligence, National Yunlin University of Science and Technology, Douliou, Taiwan.
- Shahab S Band
  - Department of Information Management, International Graduate Institute of Artificial Intelligence, National Yunlin University of Science and Technology, Douliou, Taiwan; Future Technology Research Center, National Yunlin University of Science and Technology, Douliou, Taiwan.
- Mazdak Zamani
  - Department of Computer Science, New York University, 251 Mercer, New York, NY 10012, USA
- Chi-Min Shu
  - Graduate School of Engineering Science and Technology, National Yunlin University of Science and Technology, Yunlin, 64002, Taiwan
- Sajad Khorsandroo
  - Department of Computer Science, North Carolina A&T State University, Greensboro, NC 27411, USA
18
|
AlShehhi A, Welsch R. Artificial intelligence for improving Nitrogen Dioxide forecasting of Abu Dhabi environment agency ground-based stations. JOURNAL OF BIG DATA 2023; 10:92. [PMID: 37303479 PMCID: PMC10236404 DOI: 10.1186/s40537-023-00754-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 05/08/2023] [Indexed: 06/13/2023]
Abstract
Nitrogen Dioxide (NO2) is a common air pollutant associated with several adverse health problems such as pediatric asthma, cardiovascular mortality, and respiratory mortality. Due to society's urgent need to reduce pollutant concentrations, several scientific efforts have been devoted to understanding pollutant patterns and predicting pollutants' future concentrations using machine learning and deep learning techniques. The latter techniques have recently gained much attention due to their capability to tackle complex and challenging problems in computer vision, natural language processing, etc. In the NO2 context, there is still a research gap in adopting those advanced methods to predict the concentration of pollutants. This study fills the gap by comparing the performance of several state-of-the-art artificial intelligence models that have not yet been adopted in this context. The models were trained using time series cross-validation on a rolling basis and tested across different periods using NO2 data from 20 ground-based monitoring stations collected by Environment Agency - Abu Dhabi, United Arab Emirates. Using the seasonal Mann-Kendall trend test and Sen's slope estimator, we further explored and investigated the pollutant trends across the different stations. This study is the first comprehensive study to report the temporal characteristics of NO2 across seven environmental assessment points and to compare the performance of state-of-the-art deep learning models for predicting the pollutants' future concentration. Our results reveal a difference in pollutant concentration levels due to the geographic location of the different stations, with a statistically significant decrease in the NO2 annual trend for the majority of the stations. Overall, NO2 concentrations exhibit a similar daily and weekly pattern across the different stations, with an increase in pollutant levels during the early morning and the first working day.
Comparing the performance of the state-of-the-art models, the transformer model demonstrates superior performance (MAE: 0.04 (± 0.04), MSE: 0.06 (± 0.04), RMSE: 0.001 (± 0.01), R2: 0.98 (± 0.05)) compared with LSTM (MAE: 0.26 (± 0.19), MSE: 0.31 (± 0.21), RMSE: 0.14 (± 0.17), R2: 0.56 (± 0.33)), InceptionTime (MAE: 0.19 (± 0.18), MSE: 0.22 (± 0.18), RMSE: 0.08 (± 0.13), R2: 0.38 (± 1.35)), ResNet (MAE: 0.24 (± 0.16), MSE: 0.28 (± 0.16), RMSE: 0.11 (± 0.12), R2: 0.35 (± 1.19)), XceptionTime (MAE: 0.7 (± 0.55), MSE: 0.79 (± 0.54), RMSE: 0.91 (± 1.06), R2: -4.83 (± 9.38)), and MiniRocket (MAE: 0.21 (± 0.07), MSE: 0.26 (± 0.08), RMSE: 0.07 (± 0.04), R2: 0.65 (± 0.28)) in tackling this challenge. The transformer model is a powerful model for improving the accuracy of NO2 forecasts and could strengthen the current monitoring system to control and manage air quality in the region. Supplementary Information: The online version contains supplementary material available at 10.1186/s40537-023-00754-z.
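Sen's slope estimator, used in this abstract for trend analysis, is the median of the slopes between all pairs of points in a series. The sketch below assumes equally spaced observations indexed 0, 1, 2, ...; the annual NO2 means are hypothetical, and the seasonal Mann-Kendall significance test is not implemented.

```python
from itertools import combinations
from statistics import median


def sens_slope(values):
    """Median of pairwise slopes (x_j - x_i) / (j - i) over all i < j."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    slopes = [
        (values[j] - values[i]) / (j - i)
        for i, j in combinations(range(len(values)), 2)
    ]
    return median(slopes)


# Hypothetical annual mean NO2 concentrations at one station.
annual_no2 = [5.0, 4.6, 4.4, 3.9, 3.5]

trend_per_year = sens_slope(annual_no2)
print(trend_per_year)
```

Because it takes a median rather than a least-squares fit, the estimate is less sensitive to a single outlying year than an ordinary regression slope.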
Affiliation(s)
- Aamna AlShehhi
  - Biomedical Engineering, Khalifa University, Abu Dhabi, United Arab Emirates
- Roy Welsch
  - Sloan School of Management and Statistics, Massachusetts Institute of Technology, Cambridge, Massachusetts USA
19
|
Teng M, Li S, Xing J, Fan C, Yang J, Wang S, Song G, Ding Y, Dong J, Wang S. 72-hour real-time forecasting of ambient PM2.5 by hybrid graph deep neural network with aggregated neighborhood spatiotemporal information. ENVIRONMENT INTERNATIONAL 2023; 176:107971. [PMID: 37220671 DOI: 10.1016/j.envint.2023.107971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Revised: 04/05/2023] [Accepted: 05/08/2023] [Indexed: 05/25/2023]
Abstract
The observation-based air pollution forecasting method has high computational efficiency over traditional numerical models, but a poor ability in long-term (after 6 h) forecasting due to a lack of detailed representation of atmospheric processes associated with the pollution transport. To address such limitation, here we propose a novel real-time air pollution forecasting model that applies a hybrid graph deep neural network (GNN_LSTM) to dynamically capture the spatiotemporal correlations among neighborhood monitoring sites to better represent the physical mechanism of pollutant transport across the space with the graph structure which is established with features (angle, wind speed, and wind direction) of neighborhood sites to quantify their interactions. Such design substantially improves the model performance in 72-hour PM2.5 forecasting over the whole Beijing-Tianjin-Hebei region (overall R2 increases from 0.6 to 0.79), particularly for polluted episodes (PM2.5 concentration > 55 µg/m3) with pronounced regional transport to be captured by GNN_LSTM model. The inclusion of the AOD feature further enhances the model performance in predicting PM2.5 over the sites where the AOD can inform additional aloft PM2.5 pollution features related to regional transport. The importance of neighborhood site (particularly for those in the upwind flow pathway of the target area) features for long-term PM2.5 forecast is demonstrated by the increased performance in predicting PM2.5 in the target city (Beijing) with the inclusion of additional 128 neighborhood sites. Moreover, the newly developed GNN_LSTM model also implies the "source"-receptor relationship, as impacts from distanced sites associated with regional transport grow along with the forecasting time (from 0% to 38% in 72 h) following the wind flow. Such results suggest the great potential of GNN_LSTM in long-term air quality forecasting and air pollution prevention.
Affiliation(s)
- Mengfan Teng
  - Hubei Key Laboratory of Quantitative Remote Sensing of Land and Atmosphere, School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
- Siwei Li
  - Hubei Key Laboratory of Quantitative Remote Sensing of Land and Atmosphere, School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China; Hubei Luojia Laboratory, Wuhan University, Wuhan 430079, China.
- Jia Xing
  - Department of Civil and Environmental Engineering, the University of Tennessee, Knoxville, TN 37996, USA
- Chunying Fan
  - Hubei Key Laboratory of Quantitative Remote Sensing of Land and Atmosphere, School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
- Jie Yang
  - Hubei Key Laboratory of Quantitative Remote Sensing of Land and Atmosphere, School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China; Hubei Luojia Laboratory, Wuhan University, Wuhan 430079, China
- Shuo Wang
  - School of Systems Science, Beijing Normal University, Beijing 100875, China
- Ge Song
  - Hubei Key Laboratory of Quantitative Remote Sensing of Land and Atmosphere, School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
- Yu Ding
  - Hubei Key Laboratory of Quantitative Remote Sensing of Land and Atmosphere, School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
- Jiaxin Dong
  - Hubei Key Laboratory of Quantitative Remote Sensing of Land and Atmosphere, School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
- Shansi Wang
  - Hubei Key Laboratory of Quantitative Remote Sensing of Land and Atmosphere, School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
20
|
Tao H, Jawad AH, Shather AH, Al-Khafaji Z, Rashid TA, Ali M, Al-Ansari N, Marhoon HA, Shahid S, Yaseen ZM. Machine learning algorithms for high-resolution prediction of spatiotemporal distribution of air pollution from meteorological and soil parameters. ENVIRONMENT INTERNATIONAL 2023; 175:107931. [PMID: 37119651 DOI: 10.1016/j.envint.2023.107931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Revised: 03/18/2023] [Accepted: 04/11/2023] [Indexed: 05/22/2023]
Abstract
This study uses machine learning (ML) models for a high-resolution prediction (0.1°×0.1°) of the concentration of fine particulate matter (PM2.5), the air pollutant most harmful to human health, from meteorological and soil data. Iraq was considered the study area to implement the method. Different lags and the changing patterns of four European Reanalysis (ERA5) meteorological variables (rainfall, mean temperature, wind speed and relative humidity) and one soil parameter (soil moisture) were used to select a suitable set of predictors using a non-greedy algorithm known as simulated annealing (SA). The selected predictors were used to simulate the temporal and spatial variability of air PM2.5 concentration over Iraq during the early summer (May-July), the most polluted months, using three advanced ML models: extremely randomized trees (ERT), stochastic gradient descent backpropagation (SGD-BP) and long short-term memory (LSTM) integrated with a Bayesian optimizer. The spatial distribution of the annual average PM2.5 revealed that the population of the whole of Iraq is exposed to a pollution level above the standard limit. The changes in temperature and soil moisture and the mean wind speed and humidity of the month before the early summer can predict the temporal and spatial variability of PM2.5 over Iraq during May-July. Results revealed the higher performance of LSTM, with normalized root-mean-square error and Kling-Gupta efficiency of 13.4% and 0.89, compared to 16.02% and 0.81 for SGD-BP and 17.9% and 0.74 for ERT. The LSTM could also reconstruct the observed spatial distribution of PM2.5 with MapCurve and Cramer's V values of 0.95 and 0.91, compared to 0.9 and 0.86 for SGD-BP and 0.83 and 0.76 for ERT. The study provided a methodology for forecasting spatial variability of PM2.5 concentration at high resolution during the peak pollution months from freely available data, which can be replicated in other regions for generating high-resolution PM2.5 forecasting maps.
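The two headline metrics in this abstract can be sketched as follows. The Kling-Gupta efficiency is written in its commonly used form, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), with r the Pearson correlation, alpha the ratio of standard deviations, and beta the ratio of means. The abstract does not say how the RMSE was normalized, so normalizing by the observed mean is an assumption, and the data are hypothetical.

```python
from statistics import mean, pstdev


def pearson(obs, sim):
    """Pearson correlation using population statistics."""
    mo, ms = mean(obs), mean(sim)
    cov = mean([(o - mo) * (s - ms) for o, s in zip(obs, sim)])
    return cov / (pstdev(obs) * pstdev(sim))


def kling_gupta_efficiency(obs, sim):
    """KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2); 1 is a perfect match."""
    r = pearson(obs, sim)
    alpha = pstdev(sim) / pstdev(obs)   # variability ratio
    beta = mean(sim) / mean(obs)        # bias ratio
    return 1.0 - ((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2) ** 0.5


def nrmse_percent(obs, sim):
    """RMSE divided by the observed mean, in percent (normalization is an assumption)."""
    rmse = mean([(s - o) ** 2 for o, s in zip(obs, sim)]) ** 0.5
    return 100.0 * rmse / mean(obs)


# Hypothetical observed and simulated PM2.5 values (ug/m3).
observed = [20.0, 30.0, 40.0, 50.0]
doubled = [2 * o for o in observed]   # perfectly correlated but biased simulation

kge_biased = kling_gupta_efficiency(observed, doubled)
print(kge_biased, nrmse_percent(observed, doubled))
```

The doubled series shows why KGE is reported alongside correlation-style scores: a simulation can track the observations perfectly in shape and still score poorly because of bias and inflated variability.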
Affiliation(s)
- Hai Tao
  - School of Computer and Information, Qiannan Normal University for Nationalities, Duyun, Guizhou 558000, China; State Key Laboratory of Public Big Data, Guizhou University, Guizhou, Guiyang 550025, China; Institute for Big Data Analytics and Artificial Intelligence (IBDAAI), Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia.
- Ali H Jawad
  - Faculty of Applied Sciences, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia.
- A H Shather
  - Dep of Computer Technology Engineering, Engineering Technical College, University of Alkitab, Iraq.
- Zainab Al-Khafaji
  - Department of Building and Construction Technologies Engineering, AL-Mustaqbal University College, Hillah 51001, Iraq.
- Tarik A Rashid
  - Computer Science and Engineering Department, University of Kurdistan Hewler, Erbil, KR, Iraq.
- Mumtaz Ali
  - UniSQ College, University of Southern Queensland, QLD 4350, Australia.
- Nadhir Al-Ansari
  - Dept. of Civil, Environmental and Natural Resources Engineering, Lulea Univ. of Technology, Lulea T3334, Sweden.
- Haydar Abdulameer Marhoon
  - Information and Communication Technology Research Group, Scientific Research Center, Al-Ayen University, Thi-Qar, Iraq; College of Computer Sciences and Information Technology, University of Kerbala, Karbala, Iraq.
- Shamsuddin Shahid
  - Department of Hydraulics and Hydrology, School of Civil Engineering, Faculty of Engineering, Universiti Teknologi Malaysia (UTM), 81310 Skudai, Johor, Malaysia.
- Zaher Mundher Yaseen
  - Civil and Environmental Engineering Department, King Fahd University of Petroleum & Minerals, Dhahran 31261, Saudi Arabia; Interdisciplinary Research Center for Membranes and Water Security, King Fahd University of Petroleum & Minerals, Dhahran 31261, Saudi Arabia.
21
Yang Y, Zhou G, Jiang B, Wang Q, Hu Y, Sun B. Pollution and occupational protection of diesel particulate matter in underground space. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023; 30:45840-45858. [PMID: 36708480 DOI: 10.1007/s11356-023-25386-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 01/14/2023] [Indexed: 06/18/2023]
Abstract
To address the diesel particulate matter pollution problem at the 12,306 continuous mining face of Shangwan coal mine, the spatial and temporal evolution law of diesel particulate matter generated at the three locations of the shuttle car head tunnel, contact alley, and support tunnel under the pressure-in ventilation condition of the double lane of the continuous mining face was studied by numerical simulation. The results show that the highest diesel particulate matter concentration at the shuttle car discharge is about 144.17 mg/m3, which seriously affects the health of miners. The highest diesel particulate matter concentration at the shuttle car tunnel is 52.58 mg/m3, and at the contact alley, the diesel particulate matter diffusion space is limited by the compression of the space inside the contact alley by the shuttle car machine body and the alley wall, which makes the diesel particulate matter accumulate here, forming a high diesel particulate matter concentration distribution area with a concentration value of 112.75 mg/m3. When supporting the roadway at the shuttle, diesel particulate matter accumulates in the range of X = 55 m ~ 60 m, Y = 0 m ~ 4 m, and Z = 23.4 m ~ 29.4 m. According to the degree of DPM pollution in different areas, different individual protective equipment is used to obtain different levels of pollution protection.
Affiliation(s)
- Yang Yang
  - College of Safety and Environmental Engineering, Shandong University of Science and Technology, Qingdao, 266590, China
  - State Key Laboratory of Mining Disaster Prevention and Control Co-founded by Shandong Province and the Ministry of Science and Technology, Shandong University of Science and Technology, Qingdao, 266590, China
- Gang Zhou
  - College of Safety and Environmental Engineering, Shandong University of Science and Technology, Qingdao, 266590, China
  - State Key Laboratory of Mining Disaster Prevention and Control Co-founded by Shandong Province and the Ministry of Science and Technology, Shandong University of Science and Technology, Qingdao, 266590, China
- Bingyou Jiang
  - Key Laboratory of Industrial Dust Prevention and Control & Occupational Safety and Health, Ministry of Education, Anhui University of Science & Technology, Huainan, 232001, China
- Qi Wang
  - College of Safety and Environmental Engineering, Shandong University of Science and Technology, Qingdao, 266590, China
  - State Key Laboratory of Mining Disaster Prevention and Control Co-founded by Shandong Province and the Ministry of Science and Technology, Shandong University of Science and Technology, Qingdao, 266590, China
- Yating Hu
  - College of Safety and Environmental Engineering, Shandong University of Science and Technology, Qingdao, 266590, China
  - State Key Laboratory of Mining Disaster Prevention and Control Co-founded by Shandong Province and the Ministry of Science and Technology, Shandong University of Science and Technology, Qingdao, 266590, China
- Biao Sun
  - College of Safety and Environmental Engineering, Shandong University of Science and Technology, Qingdao, 266590, China.
  - State Key Laboratory of Mining Disaster Prevention and Control Co-founded by Shandong Province and the Ministry of Science and Technology, Shandong University of Science and Technology, Qingdao, 266590, China.
22
Fan K, Dhammapala R, Harrington K, Lamb B, Lee Y. Machine learning-based ozone and PM2.5 forecasting: Application to multiple AQS sites in the Pacific Northwest. Front Big Data 2023; 6:1124148. [PMID: 36910164 PMCID: PMC9999009 DOI: 10.3389/fdata.2023.1124148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Accepted: 02/06/2023] [Indexed: 03/14/2023] Open
Abstract
Air quality in the Pacific Northwest (PNW) of the U.S. has generally been good in recent years, but unhealthy events were observed due to wildfires in summer or wood burning in winter. The current air quality forecasting system, which uses chemical transport models (CTMs), has had difficulty forecasting these unhealthy air quality events in the PNW. We developed a machine learning (ML) based forecasting system, which consists of two components, ML1 (random forest classifiers and multiple linear regression models) and ML2 (two-phase random forest regression model). Our previous study showed that the ML system provides reliable forecasts of O3 at a single monitoring site in Kennewick, WA. In this paper, we expand the ML forecasting system to predict both O3 in the wildfire season and PM2.5 in wildfire and cold seasons at all available monitoring sites in the PNW during 2017-2020, and evaluate our ML forecasts against the existing operational CTM-based forecasts. For O3, both ML1 and ML2 are used to achieve the best forecasts, which was the case in our previous study: ML2 performs better overall (R2 = 0.79), especially for low-O3 events, while ML1 correctly captures more high-O3 events. Compared to the CTM-based forecast, our O3 ML forecasts reduce the normalized mean bias (NMB) from 7.6 to 2.6% and normalized mean error (NME) from 18 to 12% when evaluating against the observation. For PM2.5, ML2 performs the best and thus is used for the final forecasts. Compared to the CTM-based PM2.5, ML2 clearly improves PM2.5 forecasts for both wildfire season (May to September) and cold season (November to February): ML2 reduces NMB (-27 to 7.9% for wildfire season; 3.4 to 2.2% for cold season) and NME (59 to 41% for wildfire season; 67 to 28% for cold season) significantly and captures more high-PM2.5 events correctly. Our ML air quality forecast system requires fewer computing resources and fewer input datasets, yet it provides forecasts that are more reliable than, or at least comparable to, the CTM-based forecast. It demonstrates that our ML system is a low-cost, reliable air quality forecasting system that can support regional/local air quality management.
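The NMB and NME figures above follow from simple ratios of summed errors to summed observations. The sketch below uses the standard definitions of the two metrics; the function names and the sample values are hypothetical and are not taken from the paper.

```python
def normalized_mean_bias(obs, pred):
    """NMB in percent: signed forecast error summed over all pairs, relative to summed observations."""
    return 100.0 * sum(p - o for o, p in zip(obs, pred)) / sum(obs)


def normalized_mean_error(obs, pred):
    """NME in percent: absolute forecast error summed over all pairs, relative to summed observations."""
    return 100.0 * sum(abs(p - o) for o, p in zip(obs, pred)) / sum(obs)


# Hypothetical daily PM2.5 observations and forecasts (ug/m3).
observed = [10.0, 20.0, 30.0]
forecast = [12.0, 18.0, 33.0]
print(normalized_mean_bias(observed, forecast))   # positive value means over-prediction on balance
print(normalized_mean_error(observed, forecast))
```

Because NMB keeps the sign of each error, over- and under-predictions can cancel, which is why the abstract reports NME alongside it.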
Affiliation(s)
- Kai Fan
  - Center for Advanced Systems Understanding, Görlitz, Germany; Helmholtz-Zentrum Dresden Rossendorf, Dresden, Germany; Laboratory for Atmospheric Research, Department of Civil and Environmental Engineering, Washington State University, Pullman, WA, United States
- Ranil Dhammapala
  - South Coast Air Quality Management District, Diamond Bar, CA, United States
- Brian Lamb
  - Laboratory for Atmospheric Research, Department of Civil and Environmental Engineering, Washington State University, Pullman, WA, United States
- Yunha Lee
  - Center for Advanced Systems Understanding, Görlitz, Germany; Helmholtz-Zentrum Dresden Rossendorf, Dresden, Germany; Laboratory for Atmospheric Research, Department of Civil and Environmental Engineering, Washington State University, Pullman, WA, United States
23
Xu H, Zhang A, Xu X, Li P, Ji Y. Prediction of Particulate Concentration Based on Correlation Analysis and a Bi-GRU Model. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:13266. [PMID: 36293843 PMCID: PMC9603264 DOI: 10.3390/ijerph192013266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Revised: 10/08/2022] [Accepted: 10/11/2022] [Indexed: 06/16/2023]
Abstract
In recent decades, particulate pollution in the air has caused severe health problems. Therefore, it has become a hot research topic to accurately predict particulate concentrations. Particle concentration has a strong spatial-temporal correlation due to pollution transportation between regions, making it important to understand how to utilize these features to predict particulate concentration. In this paper, Pearson Correlation Coefficients (PCCs) are used to compare the particle concentrations at the target site with those at other locations. The models based on bi-directional gated recurrent units (Bi-GRUs) and PCCs are proposed to predict particle concentrations. The proposed model has the advantage of requiring fewer samples and can forecast particulate concentrations in real time within the next six hours. As a final step, several Beijing air quality monitoring stations are tested for pollutant concentrations hourly. Based on the correlation analysis and the proposed prediction model, the prediction error within the first six hours is smaller than those of the other three models. The model can help environmental researchers improve the prediction accuracy of fine particle concentrations and help environmental policymakers implement relevant pollution control policies by providing tools. With the correlation analysis between the target site and adjacent sites, an accurate pollution control decision can be made based on the internal relationship.
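The correlation step described above, comparing the target site with other locations, can be sketched as follows. This is only an illustration of ranking neighbouring sites by Pearson correlation; the threshold, the site names and the helper `select_correlated_sites` are hypothetical, and the abstract does not say how the authors feed the selected sites into the Bi-GRU.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def select_correlated_sites(target, neighbours, threshold=0.8):
    """Return neighbour names whose PCC with the target meets the threshold, strongest first."""
    scores = {name: pearson(target, series) for name, series in neighbours.items()}
    kept = [name for name, r in scores.items() if r >= threshold]
    return sorted(kept, key=lambda name: -scores[name])


# Hypothetical hourly concentrations at a target station and three nearby stations.
target_site = [35.0, 40.0, 52.0, 61.0]
nearby = {
    "site_a": [33.0, 41.0, 50.0, 63.0],
    "site_b": [60.0, 48.0, 41.0, 30.0],
    "site_c": [20.0, 35.0, 30.0, 44.0],
}
print(select_correlated_sites(target_site, nearby, threshold=0.9))
```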
Affiliation(s)
- He Xu
  - School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  - Jiangsu HPC and Intelligent Processing Engineer Research Center, Nanjing 210003, China
- Aosheng Zhang
  - School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  - Jiangsu HPC and Intelligent Processing Engineer Research Center, Nanjing 210003, China
- Xin Xu
  - School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  - Jiangsu HPC and Intelligent Processing Engineer Research Center, Nanjing 210003, China
- Peng Li
  - School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  - Jiangsu HPC and Intelligent Processing Engineer Research Center, Nanjing 210003, China
- Yimu Ji
  - School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  - Jiangsu HPC and Intelligent Processing Engineer Research Center, Nanjing 210003, China
24
Pruthi D, Liu Y. Low-cost nature-inspired deep learning system for PM2.5 forecast over Delhi, India. ENVIRONMENT INTERNATIONAL 2022; 166:107373. [PMID: 35763992 DOI: 10.1016/j.envint.2022.107373] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/29/2022] [Revised: 06/21/2022] [Accepted: 06/22/2022] [Indexed: 06/15/2023]
Abstract
Air quality has a tremendous impact on India's health and prosperity. Air quality models are crucial tools for surveying and projecting air pollution episodes, which can be used to issue health advisories to take action ahead of time. Short-term increases in air pollution trigger many adverse health events; a fast, efficient, cost-effective, and reliable air quality prediction model would aid in minimizing the effect on health and prosperity. Deterministic models, on the other hand, are less robust in predicting the pollutant series since it is non-stationary and non-linear. Atmospheric chemistry models are computationally expensive and often rely on outdated emissions information. We propose a deep learning model in this study that integrates neural networks, fuzzy inference systems, and wavelet transforms to predict the most prominent air pollutant affecting Delhi, India i.e., PM2.5 (particulate matter of aerodynamic diameter less than or equal to 2.5 µm). We have included the main aspects of air quality models in this research i.e., less computational time (7 min approximately using I5-1035G1, 1.19 GHz processor), less resource-intensive (dependent only on the pollutant lagged values), and high spatial resolution (1 km) for forecasting air quality three days ahead. The model predictions show a significant correlation coefficient lying in [0.96,0.98], [0.86,0.93], and [0.82,0.91] with Central Pollution Control Board (CPCB) monitored data at various sites in Delhi for one, two, and three days of forecast respectively.
Affiliation(s)
- D Pruthi
  - Gangarosa Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA
- Y Liu
  - Gangarosa Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA.
25
Machine Learning-Based Approach Using Open Data to Estimate PM2.5 over Europe. REMOTE SENSING 2022. [DOI: 10.3390/rs14143392] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/01/2023]
Abstract
Air pollution is currently considered one of the most serious problems facing humans. Fine particulate matter with a diameter smaller than 2.5 micrometres (PM2.5) is a very harmful air pollutant that is linked with many diseases. In this study, we created a machine learning-based scheme to estimate PM2.5 using various open data such as satellite remote sensing, meteorological data, and land variables to increase the limited spatial coverage provided by ground monitors. A space-time extremely randomised trees model was used to estimate PM2.5 concentrations over Europe; this model achieved good results, with an out-of-sample cross-validated R2 of 0.69, RMSE of 5 μg/m3, and MAE of 3.3 μg/m3. The outcome of this study is a daily full-coverage PM2.5 dataset with 1 km spatial resolution for the three-year period of 2018–2020. We found that air quality improved throughout the study period over all countries in Europe. In addition, we compared PM2.5 levels during the COVID-19 lockdown in the months March–June with the average of the previous 4 months and the following 4 months. We found that this lockdown had a positive effect on air quality in most parts of the study area except for the United Kingdom, Ireland, north of France, and south of Italy. This is the first study that depends only on open data and covers the whole of Europe with high spatial and temporal resolutions. The reconstructed dataset will be published under a free and open license and can be used in future air quality studies.
26
Gul S, Khan GM, Yousaf S. Multi-step short-term PM2.5 forecasting for enactment of proactive environmental regulation strategies. ENVIRONMENTAL MONITORING AND ASSESSMENT 2022; 194:386. [PMID: 35445884 PMCID: PMC9022063 DOI: 10.1007/s10661-022-10029-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/02/2022] [Accepted: 04/05/2022] [Indexed: 06/14/2023]
Abstract
Particulate matter is one of the key contributors to air pollution and climate change. Long-term exposure to constituents of air pollutants has had serious health implications for both humans and plants, leading to a detrimental impact on the economy. Among the pollutants contributing to air quality determination, particulate matter has been linked to serious health implications, causing pulmonary complications, cardiovascular diseases, growth retardation and ultimately death. In agriculture, crop yield is also negatively impacted by the deposition of particulate matter on the stomata of the plant, which is alarming and can cause food security concerns. The deleterious impact of air pollutants on human health and on agricultural and economic well-being highlights the importance of quantifying and forecasting particulate matter. Several deterministic and deep learning models have been employed in recent years to forecast the concentration of particulate matter. Among them, deep learning models have shown promising results when it comes to modeling and forecasting time series data. We have explored recurrent neural networks with an LSTM model, which shows potential to effectively predict particulate matter (PM2.5) based on multi-step multi-variate data of two of the most polluted regions of South Asia, Beijing, China and Punjab, Pakistan. The LSTM model is tuned using a Bayesian optimization technique to employ the appropriate hyper-parameters and weight initialization strategies based on the dataset. The model was able to predict PM2.5 for the next hour with a root-mean-square error (RMSE) of 0.1913 (91.5% accuracy), and this error gradually increases with the number of time steps, with the next-24-hour prediction having an RMSE of 0.7290. For the Punjab dataset, with data recorded once a day, the RMSE for the next-day forecast is 0.2192. These multi-step short-term forecasts would play a pivotal role in establishing an early warning system based on the calculated air quality index (AQI) and enable the government to enact policies to contain it.
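Multi-step forecasting of this kind usually starts by cutting the series into input windows paired with multi-step targets. The sketch below shows that preparation step for a single variable only; the abstract describes multi-variate inputs and does not give window lengths, so `n_in`, `n_out` and the function name are hypothetical choices.

```python
def make_windows(series, n_in, n_out):
    """Pair each run of n_in past values with the n_out values that follow it."""
    pairs = []
    for start in range(len(series) - n_in - n_out + 1):
        inputs = series[start:start + n_in]
        targets = series[start + n_in:start + n_in + n_out]
        pairs.append((inputs, targets))
    return pairs


# Hypothetical hourly PM2.5 readings: use 3 past hours to predict the next 2.
readings = [41.0, 44.0, 47.0, 52.0, 50.0, 46.0]
for inputs, targets in make_windows(readings, n_in=3, n_out=2):
    print(inputs, "->", targets)
```

Each pair can then be fed to a sequence model such as an LSTM; longer `n_out` horizons are the setting in which the abstract reports the RMSE growing with the number of steps.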
Affiliation(s)
- Saba Gul
  - School Of Electrical Engineering and Computer Science, National University of Science and Technology, Islamabad, Pakistan
  - National Center of Artificial Intelligence, University of Engineering and Technology, Peshawar, Pakistan
- Gul Muhammad Khan
  - National Center of Artificial Intelligence, University of Engineering and Technology, Peshawar, Pakistan
- Sohail Yousaf
  - National Center of Artificial Intelligence, University of Engineering and Technology, Peshawar, Pakistan
27
Yang L, Hong S, He C, Huang J, Ye Z, Cai B, Yu S, Wang Y, Wang Z. Spatio-Temporal Heterogeneity of the Relationships Between PM2.5 and Its Determinants: A Case Study of Chinese Cities in Winter of 2020. Front Public Health 2022; 10:810098. [PMID: 35480572 PMCID: PMC9035510 DOI: 10.3389/fpubh.2022.810098] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/08/2021] [Accepted: 03/21/2022] [Indexed: 11/17/2022] Open
Abstract
Fine particulate matter (PM2.5) poses a threat to human health in China, particularly in winter. The pandemic of coronavirus disease 2019 (COVID-19) led to a series of strict control measures in Chinese cities, resulting in a short-term significant improvement in air quality. This is a perfect case to explore driving factors affecting the PM2.5 distributions in Chinese cities, thus helping form better policies for future PM2.5 mitigation. Based on panel data of 332 cities, we analyzed the effects of natural and anthropogenic factors on PM2.5 pollution by applying the geographically and temporally weighted regression (GTWR) model. We found that the PM2.5 concentration of 84.3% of cities decreased after lockdown. Spatially, in the winter of 2020, cities with high PM2.5 concentrations were mainly distributed in Northeast China, the North China Plain and the Tarim Basin. Higher temperature, wind speed and relative humidity more readily promoted haze pollution in the northwest of the country, where enhanced surface pressure decreased PM2.5 concentrations. Furthermore, the intensity of trip activities (ITAs) had a significant positive effect on PM2.5 pollution in Northwest and Central China. The number of daily pollutant operating vents of key polluting enterprises in the industrial sector (VOI) in northern cities was positively correlated with the PM2.5 concentration; conversely, the number of daily pollutant operating vents of key polluting enterprises in the power sector (VOP) had a negative effect on the PM2.5 concentration in these regions. This work provides some implications for regional air quality improvement policies of Chinese cities in wintertime.
Affiliation(s)
- Lu Yang
  - School of Resource and Environment Science, Wuhan University, Wuhan, China
- Song Hong
  - School of Resource and Environment Science, Wuhan University, Wuhan, China
- Chao He
  - College of Resources and Environment, Yangtze University, Wuhan, China
- Jiayi Huang
  - Business School, The University of Sydney, Sydney, NSW, Australia
- Zhixiang Ye
  - School of Resource and Environment Science, Wuhan University, Wuhan, China
- Bofeng Cai
  - Center for Climate Change and Environmental Policy, Chinese Academy of Environmental Planning, Beijing, China
- Shuxia Yu
  - College of Resource and Environment, Huazhong Agricultural University, Wuhan, China
- Yanwen Wang
  - Economics and Management College, China University of Geosciences, Wuhan, China
- Zhen Wang
  - College of Resource and Environment, Huazhong Agricultural University, Wuhan, China
28
Jeong D, Yoo C, Yeh SW, Yoon JH, Lee D, Lee JB, Choi JY. Statistical Seasonal Forecasting of Winter and Spring PM2.5 Concentrations Over the Korean Peninsula. ASIA-PACIFIC JOURNAL OF ATMOSPHERIC SCIENCES 2022; 58:549-561. [PMID: 35371395 PMCID: PMC8960088 DOI: 10.1007/s13143-022-00275-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/15/2021] [Revised: 03/01/2022] [Accepted: 03/04/2022] [Indexed: 06/14/2023]
Abstract
Concentrations of fine particulate matter smaller than 2.5 μm in diameter (PM2.5) over the Korean Peninsula experience year-to-year variations due to interannual variation in climate conditions. This study develops a multiple linear regression model based on slowly varying boundary conditions to predict winter and spring PM2.5 concentrations at 1-3-month lead times. Nation-wide observations of Korea, which began in 2015, are extended back to 2005 using the local Seoul government's observations, constructing a long-term dataset covering the 2005-2019 period. Using the forward selection stepwise regression approach, we identify sea surface temperature (SST), soil moisture, and 2-m air temperature as predictors for the model, while rejecting sea ice concentration and snow depth due to weak correlations with seasonal PM2.5 concentrations. For the wintertime (December-January-February, DJF), the model based on SSTs over the equatorial Atlantic and soil moisture over eastern Europe, along with the linear PM2.5 concentration trend, generates a 3-month forecast that shows a 0.69 correlation with observations. For the springtime (March-April-May, MAM), the accuracy of the model using SSTs over the North Pacific and 2-m air temperature over East Asia increases to 0.75. Additionally, we find a linear relationship between the seasonal mean PM2.5 concentration and an extreme metric, i.e., the seasonal number of high PM2.5 concentration days. Supplementary information: the online version contains supplementary material available at 10.1007/s13143-022-00275-4.
Affiliation(s)
- Dajeong Jeong
  - Department of Climate and Energy Systems Engineering, Ewha Womans University, 52 Ewhayeodae-gil, Seodaemun-gu, Seoul, South Korea
- Changhyun Yoo
  - Department of Climate and Energy Systems Engineering, Ewha Womans University, 52 Ewhayeodae-gil, Seodaemun-gu, Seoul, South Korea
- Sang-Wook Yeh
  - Department of Marine Sciences and Convergent Technology, Hanyang University ERICA, Ansan, South Korea
- Jin-Ho Yoon
  - School of Earth Sciences and Environmental Engineering, Gwangju Institute of Science and Technology, Gwangju, South Korea
- Daegyun Lee
  - Air Quality Forecasting Center, National Institute of Environmental Research, Incheon, South Korea
- Jae-Bum Lee
  - Air Quality Forecasting Center, National Institute of Environmental Research, Incheon, South Korea
- Jin-Young Choi
  - Air Quality Forecasting Center, National Institute of Environmental Research, Incheon, South Korea
29
Attention-Based Distributed Deep Learning Model for Air Quality Forecasting. SUSTAINABILITY 2022. [DOI: 10.3390/su14063269] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 02/01/2023]
Abstract
Air quality forecasting has become an essential factor in facilitating sustainable development worldwide. Several countries have implemented monitoring stations to collect air pollution particle data and meteorological information using parameters such as hourly timespans. This research focuses on unravelling a new framework for air quality prediction worldwide and features Busan, South Korea as its model city. The paper proposes the application of an attention-based convolutional BiLSTM autoencoder model. The proposed deep learning model has been trained on a distributed framework, referred to as data parallelism, to forecast the intensity of particle pollution (PM2.5 and PM10). The algorithm automatically learns the intrinsic correlation among the particle pollution in different locations. Each location’s meteorological and traffic data is extensively exploited to improve the model’s performance. The model has been trained using air quality particle data and car traffic information. The traffic information is obtained by a device which counts cars passing a specific area through the YOLO algorithm, and then sends the data to a stacked deep autoencoder to be encoded alongside the meteorological data before the final prediction. In addition, multiple one-dimensional CNN layers are used to obtain the local spatial features jointly with a stacked attention-based BiLSTM layer to figure out how air quality particles are correlated in space and time. The evaluation of the new attention-based convolutional BiLSTM autoencoder model was derived from data collected and retrieved from comprehensive experiments conducted in South Korea. The results not only show that the framework outperforms the previous models on both short- and long-term predictions but also indicate that traffic information can improve the accuracy of air quality forecasting. For instance, during PM2.5 prediction, the proposed attention-based model obtained the lowest MAE (5.02 and 22.59, respectively, for short-term and long-term prediction), RMSE (7.48 and 28.02) and SMAPE (17.98 and 39.81) among all the models, which indicates close agreement between observed and predicted values. It was also found that the newly proposed model had the lowest average training time compared to the baseline algorithms. Furthermore, the proposed framework was successfully deployed in a cloud server in order to provide future air quality information in real time and when needed.
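The three error measures reported above can be computed as in the following sketch. MAE and RMSE use their standard definitions; SMAPE has several variants in the literature, and the one shown (absolute error over the mean of absolute observed and predicted values, in percent) is an assumption, since the abstract does not state which form was used. Sample values are hypothetical.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root-mean-square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def smape(obs, pred):
    """Symmetric MAPE in percent (assumed variant: denominator is (|o| + |p|) / 2)."""
    terms = [abs(p - o) / ((abs(o) + abs(p)) / 2.0) for o, p in zip(obs, pred)]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical observed and predicted PM2.5 values.
observed = [100.0, 200.0]
predicted = [110.0, 180.0]
print(mae(observed, predicted), rmse(observed, predicted), smape(observed, predicted))
```

Note that this SMAPE variant is undefined when an observation and its prediction are both zero, so real data would need a guard for that case.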
30
A Machine Learning-Based Ensemble Framework for Forecasting PM2.5 Concentrations in Puli, Taiwan. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12052484] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
Forecasting of PM2.5 concentration is a global concern. Evidence has shown that ambient PM2.5 concentrations are harmful to human health and affect climate change, plant species mortality, etc. PM2.5 concentrations are caused by natural and anthropogenic activities, and it is challenging to predict them due to many uncertain factors. Current research has focused on developing new models while overlooking the fact that every single model for PM2.5 prediction has its own strengths and weaknesses. This paper proposes an ensemble framework which combines four diverse learning models for PM2.5 forecasting in Puli, Taiwan. It explores the synergy between parametric and non-parametric learning, and short-term and long-term learning. The feature set covers periodic, meteorological, and autoregression variables which are selected by a spiral validation process. The experimental dataset, spanning from 1 January 2008 to 31 December 2019, from Puli Township in central Taiwan, is used in this study. The experimental results show the proposed multi-model framework can synergize the advantages of the embedded models and obtain an improved forecasting result. Further, the benefit obtained by blending short-term learning with long-term learning is validated, surpassing the performance obtained by using just a single type of learning. Our multi-model framework compares favorably with deep-learning models on the Puli dataset. It also shows high adaptivity, such that our multi-model framework is comparable to the leading methods for PM2.5 forecasting in Delhi, India.
31
Abstract
This paper implements deep learning methods of recurrent neural networks and long short-term memory (LSTM) models. Two kinds of time-series data were used: air pollutant factors, such as O3, SO2, and CO2 from 2017 to 2019, and meteorological factors such as temperature, humidity, wind direction, and wind speed. A trained model was used to predict air pollution within an eight-hour period. Correlation analysis was applied using Pearson and Spearman correlation coefficients. The KNN method was used to fill in the missing values to improve the generated model’s accuracy. The average absolute error percentage value was used in the experiments to evaluate the model’s performance. Of the models tested, LSTM had the lowest RMSE, at 1.9. CNN had a considerably higher RMSE of 3.5, followed by Bi-LSTM at 2.5 and Bi-GRU at 2.7. The RNN, at 2.4, was slightly higher than the LSTM.
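The KNN gap-filling step mentioned above can be illustrated with a small pure-Python sketch. It assumes Euclidean distance over the columns that are present in the incomplete record, an unweighted mean of the k nearest complete records, and at least one complete record; none of these details are given in the abstract, and `knn_impute` is a hypothetical helper rather than the authors' implementation.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest complete rows."""
    complete = [r for r in rows if None not in r]
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        # Only columns observed in this incomplete row contribute to the distance.
        known = [i for i, v in enumerate(row) if v is not None]

        def dist(other):
            return math.sqrt(sum((row[i] - other[i]) ** 2 for i in known))

        nearest = sorted(complete, key=dist)[:k]
        filled.append([
            v if v is not None else sum(n[i] for n in nearest) / len(nearest)
            for i, v in enumerate(row)
        ])
    return filled


# Hypothetical records of (temperature, pollutant concentration); one concentration is missing.
records = [[1.0, 10.0], [2.0, 20.0], [10.0, 100.0], [1.5, None]]
print(knn_impute(records, k=2)[3])
```

In practice the columns would be scaled first so that no single variable dominates the distance.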
32
Chen CWS, Chiu LM. Ordinal Time Series Forecasting of the Air Quality Index. ENTROPY 2021; 23:e23091167. [PMID: 34573792 PMCID: PMC8469594 DOI: 10.3390/e23091167] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/29/2021] [Revised: 08/31/2021] [Accepted: 09/01/2021] [Indexed: 11/16/2022]
Abstract
This research models and forecasts daily AQI (air quality index) levels in 16 cities/counties of Taiwan, examines their AQI level forecast performance via a rolling window approach over a one-year validation period, including multi-level forecast classification, and measures the forecast accuracy rates. We employ statistical modeling and machine learning with three weather covariates of daily accumulated precipitation, temperature, and wind direction and also include seasonal dummy variables. The study utilizes four models to forecast air quality levels: (1) an autoregressive model with exogenous variables and GARCH (generalized autoregressive conditional heteroskedasticity) errors; (2) an autoregressive multinomial logistic regression; (3) multi-class classification by support vector machine (SVM); (4) neural network autoregression with exogenous variable (NNARX). These models relate to lag-1 AQI values and the previous day’s weather covariates (precipitation and temperature), while wind direction serves as an hour-lag effect based on the idea of nowcasting. The results demonstrate that autoregressive multinomial logistic regression and the SVM method are the best choices for AQI-level predictions regarding the high average and low variation accuracy rates.
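The rolling window evaluation described above can be sketched as follows. The forecaster here is a deliberately simple stand-in, not one of the four models in the paper: it predicts the level that most often followed the latest level inside the current window and falls back to persistence otherwise. The function names and the sample sequence are hypothetical.

```python
from collections import Counter


def predict_next(history):
    """Predict the next AQI level as the most common successor of the latest level in the window."""
    last = history[-1]
    successors = Counter(b for a, b in zip(history, history[1:]) if a == last)
    if not successors:
        return last  # no precedent in the window: fall back to persistence
    return successors.most_common(1)[0][0]


def rolling_accuracy(levels, window):
    """Refit on the most recent `window` days before each forecast and report the hit rate."""
    hits = 0
    total = 0
    for t in range(window, len(levels)):
        forecast = predict_next(levels[t - window:t])
        hits += int(forecast == levels[t])
        total += 1
    return hits / total  # requires len(levels) > window


# Hypothetical daily AQI levels (1 = good, 2 = moderate, 3 = unhealthy for sensitive groups).
daily_levels = [1, 2, 1, 2, 2, 3, 2, 1, 2, 1]
print(rolling_accuracy(daily_levels, window=4))
```

Sliding the training window forward one day at a time means every forecast uses only past data, which is what makes the resulting accuracy rate an out-of-sample measure.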