1
Garbagna L, Babu Saheer L, Maktab Dar Oghaz M. AI-driven approaches for air pollution modelling: A comprehensive systematic review. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2025; 373:125937. [PMID: 40058557 DOI: 10.1016/j.envpol.2025.125937] [Received: 11/14/2024] [Revised: 02/04/2025] [Accepted: 02/25/2025] [Indexed: 03/28/2025]
Abstract
In recent years, air quality has become a global issue with the rise of harmful pollutants and their effects on climate change. Urban areas are especially affected by air pollution, resulting in a deterioration of the environment and a surge in health complications. Many studies have sought to accurately predict future pollutant concentration levels using different methods. This paper introduces the current physical models for air quality prediction and conducts an extensive systematic literature review of Machine Learning and Deep Learning techniques for predicting pollutants. The work groups studies that use similar approaches and compares their methodologies and techniques. Furthermore, a distinction is made between temporal and spatiotemporal models to understand and highlight how both approaches impact predictions of future air pollutant concentration levels. The review differs from similar works in that it focuses not only on comparing models and approaches but also on analysing how the use of external features, such as meteorological data, traffic information, and land usage, affects pollutant levels and model accuracy in air quality forecasting. Performance and limitations are explored for both Machine Learning and Deep Learning approaches, and the work offers a discussion of their comparison and possible future developments in this research space. The review highlights that Deep Learning models tend to be more suitable for forecasting problems because of their ability to represent features and spatio-temporal correlations, and it provides directions for further work, from model utilisation to feature inclusion.
Affiliation(s)
- Lorenzo Garbagna
- Anglia Ruskin University, East Road, Cambridge, CB1 1PT, Cambridgeshire, United Kingdom.
- Lakshmi Babu Saheer
- Anglia Ruskin University, East Road, Cambridge, CB1 1PT, Cambridgeshire, United Kingdom
2
Yuan X, Hong X, Huang Z, Sheng L, Zhang J, Chen D, Zhong Z, Wang B, Zheng J. Uncovering key sources of regional ozone simulation biases using machine learning and SHAP analysis. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2025; 372:126012. [PMID: 40057169 DOI: 10.1016/j.envpol.2025.126012] [Received: 12/11/2024] [Revised: 02/15/2025] [Accepted: 03/05/2025] [Indexed: 03/14/2025]
Abstract
Atmospheric chemical transport models (CTMs) are widely used in air quality management, but still have large biases in simulations. Accurately and efficiently identifying key sources of simulation biases is crucial for model improvement. However, traditional approaches, such as sensitivity and uncertainty analyses, are computationally intensive and inefficient, as they require numerous model runs. In this study, we explored the use of machine learning, specifically XGBoost combined with SHAP analysis, as an efficient diagnostic tool for analyzing simulation biases, focusing on ozone modeling in Guangdong Province as a case study. We used the bias of model inputs as features and excluded a dataset that was more susceptible to observational uncertainties to better target bias sources. Results reveal that biases in concentrations of NO2, NO and PM2.5, temperature and biogenic emissions are important sources that lead to O3 simulation biases. Notably, NOx emissions were identified as the primary cause, particularly in VOC-limited regimes during autumn and winter. Additionally, underestimated NOx emissions caused the model to misrepresent the NO2-O3 relationship, leading to an underestimation of the spatial extent of VOC-limited regimes in the PRD. This study demonstrates that enhancing NOx emission estimates reduces O3 simulation biases in the PRD by 34% and enhances the representation of the NO2-O3 relationship.
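The attribution idea behind SHAP analysis can be illustrated with exact Shapley values on a toy model. This is only a sketch: it does not reproduce the authors' XGBoost pipeline or call the SHAP library, and the linear `toy_bias_model`, its coefficients, and the input values are hypothetical stand-ins chosen so that the result is easy to check by hand.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` keep their actual value; the rest use the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # size of the coalition that feature i joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi


def toy_bias_model(z):
    # Hypothetical: ozone simulation bias as a linear function of input biases.
    no2_bias, temperature_bias, pm25_bias = z
    return 0.8 * no2_bias + 0.3 * temperature_bias - 0.1 * pm25_bias


x = [5.0, 2.0, 10.0]          # hypothetical input biases for one sample
baseline = [0.0, 0.0, 0.0]    # reference point: no input bias
phi = shapley_values(toy_bias_model, x, baseline)
```

For a linear model each attribution reduces to the coefficient times the feature's departure from the baseline, and the attributions sum to the difference between the model output at `x` and at `baseline`. The exact enumeration grows exponentially with the number of features, which is why practical tools rely on model-specific shortcuts.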
Affiliation(s)
- Xin Yuan
- College of Environment and Climate, Institute for Environmental and Climate Research, Jinan University, Guangzhou, 511443, China
- Xinlong Hong
- College of Environment and Climate, Institute for Environmental and Climate Research, Jinan University, Guangzhou, 511443, China
- Zhijiong Huang
- College of Environment and Climate, Institute for Environmental and Climate Research, Jinan University, Guangzhou, 511443, China.
- Li Sheng
- College of Environment and Climate, Institute for Environmental and Climate Research, Jinan University, Guangzhou, 511443, China
- Jinlong Zhang
- College of Environment and Climate, Institute for Environmental and Climate Research, Jinan University, Guangzhou, 511443, China
- Duohong Chen
- Guangdong Ecological Environment Monitoring Center, Guangzhou, 510308, China
- Zhuangmin Zhong
- Guangdong Ecological Environment Monitoring Center, Guangzhou, 510308, China
- Boguang Wang
- College of Environment and Climate, Institute for Environmental and Climate Research, Jinan University, Guangzhou, 511443, China
- Junyu Zheng
- Sustainable Energy and Environmental Thrust, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, 511458, China
3
Uche-Soria M, Tabuenca B, Halcón-Gibert G, Núñez-Guerrero Y. Quantifying and Forecasting Emission Reductions in Urban Mobility: An IoT-Driven Bike-Sharing Analysis. SENSORS (BASEL, SWITZERLAND) 2025; 25:2163. [PMID: 40218675 PMCID: PMC11991179 DOI: 10.3390/s25072163] [Received: 02/24/2025] [Revised: 03/23/2025] [Accepted: 03/26/2025] [Indexed: 04/14/2025]
Abstract
The growing urgency to address urban air quality and climate change has intensified the need for sustainable mobility solutions that mitigate vehicular emissions. Bike-sharing systems (BSSs) represent a viable alternative; however, their precise environmental impact remains insufficiently explored. This study quantifies and forecasts reductions in CO2 and NOx emissions resulting from BSS usage in Madrid by integrating real-time IoT sensor data with an advanced predictive model. The proposed framework effectively captures nonlinear and seasonal mobility and emission patterns, achieving high predictive accuracy while demonstrating significant energy savings. These findings confirm the environmental benefits of BSSs and provide urban planners and policymakers with a robust tool to extend and replicate this analysis in other cities, fostering sustainable urban mobility and improved air quality.
Affiliation(s)
- Manuel Uche-Soria
- Department of Engineering Organization, Business Administration and Statistics, Universidad Politécnica de Madrid, 28006 Madrid, Spain
- Bernardo Tabuenca
- Department of Computer Systems, Universidad Politécnica de Madrid, 28031 Madrid, Spain
- Gonzalo Halcón-Gibert
- Department of Engineering Organization, Business Administration and Statistics, Universidad Politécnica de Madrid, 28006 Madrid, Spain
- Yilsy Núñez-Guerrero
- Department of Engineering Organization, Business Administration and Statistics, Universidad Politécnica de Madrid, 28006 Madrid, Spain
4
Rad AK, Nematollahi MJ, Pak A, Mahmoudi M. Predictive modeling of air quality in the Tehran megacity via deep learning techniques. Sci Rep 2025; 15:1367. [PMID: 39779721 PMCID: PMC11711626 DOI: 10.1038/s41598-024-84550-6] [Received: 10/30/2024] [Accepted: 12/24/2024] [Indexed: 01/11/2025] Open
Abstract
Air pollution is a significant challenge in metropolitan areas, where increasing amounts of air pollutants threaten public health and environmental safety. The present study aims to forecast the concentrations of various air pollutants, including CO, O3, NO2, SO2, PM10, and PM2.5, from 2013 to 2023 in the Tehran megacity, Iran, via deep learning (DL) models and to evaluate their effectiveness relative to conventional machine learning (ML) methods. Key driving variables, including temperature, relative humidity, dew point, wind speed, and air pressure, were considered. R-squared (R2), root-mean-square error (RMSE), mean absolute error (MAE), and mean-square error (MSE) were used to assess and compare the models. This research demonstrated that DL models typically outperform ML models in forecasting air pollution. Gated recurrent units (GRUs), fully connected neural networks (FCNNs), and convolutional neural networks (CNNs) recorded R2 and MSE values of 0.5971 and 42.11 for CO, 0.7873 and 171.40 for O3, and 0.4954 and 25.17 for SO2, respectively. In addition, the FCNN and GRU performed well in predicting NO2 (R2 = 0.6476 and MSE = 75.16), PM10 (R2 = 0.8712 and MSE = 45.11), and PM2.5 (R2 = 0.9276 and MSE = 58.12) concentrations. In terms of operational speed, the FCNN model was the most efficient, with minimum and maximum runtimes of 13 and 28 s, respectively. The feature importance analysis suggested that CO, O3 and NO2, SO2 and PM10, and PM2.5 are most affected by temperature, humidity, PM2.5, and PM10, respectively. Thus, temperature and humidity were the primary factors affecting the variability in pollutant concentrations. The results confirm that the DL models achieve significant accuracy and serve as essential instruments for managing air pollution, providing practical insights for decision-makers to adopt efficient air quality control strategies.
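The four evaluation metrics named in this abstract follow standard definitions, which the sketch below computes on a small made-up sample. The observed and predicted values are hypothetical and unrelated to the Tehran data.

```python
from math import sqrt


def regression_metrics(observed, predicted):
    """Return R2, RMSE, MAE and MSE for paired observations and predictions."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_observed = sum(observed) / n
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    ss_residual = sum(e * e for e in errors)
    r2 = 1.0 - ss_residual / ss_total  # coefficient of determination
    return {"R2": r2, "RMSE": sqrt(mse), "MAE": mae, "MSE": mse}


observed = [10.0, 20.0, 30.0, 40.0]     # hypothetical concentrations
predicted = [12.0, 18.0, 33.0, 39.0]    # hypothetical model output
metrics = regression_metrics(observed, predicted)
```

Because MSE and RMSE carry the squared and original units of the pollutant, they are not comparable across pollutants measured on different scales, whereas R2 is unitless.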
Affiliation(s)
- Abdullah Kaviani Rad
- Department of Environmental Engineering and Natural Resources, College of Agriculture, Shiraz University, Shiraz, 71946-85111, Iran
- Abbas Pak
- Department of Computer Sciences, Shahrekord University, Shahrekord, Iran
- Mohammadreza Mahmoudi
- Department of Statistics, Faculty of Science, Fasa University, Fasa, 74616-86131, Iran
5
Wu Y, Wang X, Wang M, Liu X, Zhu S. Time-Series Forecasting of PM2.5 and PM10 Concentrations Based on the Integration of Surveillance Images. SENSORS (BASEL, SWITZERLAND) 2024; 25:95. [PMID: 39796885 PMCID: PMC11722996 DOI: 10.3390/s25010095] [Received: 11/20/2024] [Revised: 12/19/2024] [Accepted: 12/22/2024] [Indexed: 01/13/2025]
Abstract
Accurate and timely air quality forecasting is crucial for mitigating pollution-related hazards and protecting public health. Recently, there has been a growing interest in integrating visual data for air quality prediction. However, some limitations remain in existing literature, such as their focus on coarse-grained classification, single-moment estimation, or reliance on indirect and unintuitive information from visual images. Here we present a dual-channel deep learning model, integrating surveillance images and multi-source numerical data for air quality forecasting. Our model, which combines a single-channel hybrid network consisting of VGG16 and LSTM (named VGG16-LSTM) with a single-channel Long Short-Term Memory (LSTM) network, efficiently captures detailed spatiotemporal features from surveillance image sequences and temporal features from atmospheric, meteorological, and temporal data, enabling accurate time-series forecasting of PM2.5 and PM10 concentrations. Experiments conducted on the 2021 Shanghai dataset demonstrate that the proposed model significantly outperforms traditional machine learning methods in terms of accuracy and robustness for time-series forecasting, achieving R2 values of 0.9459 and 0.9045 and RMSE values of 4.79 μg/m3 and 11.51 μg/m3 for PM2.5 and PM10, respectively. Furthermore, validation results on the datasets from two stations in Kaohsiung, Taiwan, with average R2 values of 0.9728 and 0.9365 and average RMSE values of 1.89 μg/m3 and 5.69 μg/m3 for PM2.5 and PM10 using a pretrain-finetune training strategy, confirm the model's adaptability across diverse geographical contexts. These findings highlight the potential of integrating surveillance images to enhance air quality prediction, offering an effective supplement to ground-level environmental monitoring. 
Future work will focus on expanding datasets and optimizing network architectures to further improve forecasting accuracy and computational efficiency, enhancing the model's scalability for broader regional air quality management.
Affiliation(s)
- Yong Wu
- School of Geographical Sciences, Fujian Normal University, Fuzhou 350117, China
- Xiaochu Wang
- Shanghai Surveying and Mapping Institute, Shanghai 200063, China
- School of Geography, Nanjing Normal University, Nanjing 210023, China
- Meizhen Wang
- School of Geography, Nanjing Normal University, Nanjing 210023, China
- Xuejun Liu
- School of Geography, Nanjing Normal University, Nanjing 210023, China
- Sifeng Zhu
- Shanghai Institute of Satellite Engineering, Shanghai 201109, China
6
Fournier C, Fernandez-Fernandez R, Cirés S, López-Orozco JA, Besada-Portas E, Quesada A. LSTM networks provide efficient cyanobacterial blooms forecasting even with incomplete spatio-temporal data. WATER RESEARCH 2024; 267:122553. [PMID: 39388977 DOI: 10.1016/j.watres.2024.122553] [Received: 06/10/2024] [Revised: 09/21/2024] [Accepted: 09/28/2024] [Indexed: 10/12/2024]
Abstract
Cyanobacteria are the most frequent dominant species of algal blooms in inland waters, threatening ecosystem function and water quality, especially when toxin-producing strains predominate. Enhanced by anthropogenic activities and global warming, cyanobacterial blooms are expected to increase in frequency and global distribution. Early Warning Systems (EWS) for cyanobacterial bloom development allow timely implementation of management measures, reducing the risks associated with these blooms. In this paper, we propose an effective EWS for cyanobacterial bloom forecasting, which uses 6 years of incomplete high-frequency spatio-temporal data from multiparametric probes, including phycocyanin (PC) fluorescence as a proxy for cyanobacteria. A probe-agnostic and replicable method is proposed to pre-process the data and to generate time series specific to cyanobacterial bloom forecasting. Using these pre-processed data, six different non-site/species-specific predictive models were compared, including the autoregressive and multivariate versions of Linear Regression, Random Forest, and Long Short-Term Memory (LSTM) neural networks. Results were analyzed for seven forecasting time horizons ranging from 4 to 28 days and evaluated with a hybrid system that combined regression metrics (MSE, R2, MAPE) for PC values, classification metrics (Accuracy, F1, Kappa) for a proposed alarm level of 10 µg PC/L, and a forecasting-specific metric that measures prediction improvement over the displaced signal (skill). The multivariate version of LSTM showed the best and most consistent results across all forecasting horizons and metrics, achieving accuracies of up to 90 % in predicting the proposed PC alarm level. Additionally, positive skill values indicated its outstanding effectiveness in forecasting cyanobacterial blooms from 16 to 28 days in advance.
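The hybrid evaluation described here pairs an alarm-level classification score with a skill measure against the displaced signal. The sketch below assumes the common MSE-based skill score, 1 - MSE(model) / MSE(displaced signal); the abstract does not state the paper's exact formula, and the phycocyanin series and forecasts are invented for illustration.

```python
def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def skill_over_displaced_signal(series, forecasts, horizon):
    """Skill of `forecasts` for series[horizon:] versus the displaced signal.

    The displaced (persistence) forecast for time t + horizon is simply the
    value observed at time t. Positive skill means the model beats it.
    """
    targets = series[horizon:]
    displaced = series[:-horizon]
    return 1.0 - mean_squared_error(forecasts, targets) / mean_squared_error(displaced, targets)


def alarm_accuracy(forecasts, targets, alarm_level=10.0):
    """Share of time steps where forecast and observation agree on the alarm."""
    hits = sum((f >= alarm_level) == (t >= alarm_level) for f, t in zip(forecasts, targets))
    return hits / len(targets)


pc_series = [2.0, 4.0, 8.0, 12.0, 16.0, 12.0, 6.0]   # hypothetical PC values, µg/L
horizon = 2                                           # steps ahead (hypothetical)
pc_forecasts = [7.0, 9.0, 14.0, 13.0, 8.0]            # hypothetical forecasts for pc_series[2:]

skill = skill_over_displaced_signal(pc_series, pc_forecasts, horizon)
accuracy = alarm_accuracy(pc_forecasts, pc_series[horizon:])
```

In this made-up case the model misses one alarm (a forecast of 9 against an observed 12) yet still shows positive skill, which is why the two kinds of metric are reported side by side.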
Affiliation(s)
- Claudia Fournier
- Departamento de Biología, Universidad Autónoma de Madrid, 28049 Madrid, Spain
- Raúl Fernandez-Fernandez
- Departamento de Arquitectura de Computadores y Automática, Universidad Complutense de Madrid, 28040 Madrid, Spain
- Samuel Cirés
- Departamento de Biología, Universidad Autónoma de Madrid, 28049 Madrid, Spain
- José A López-Orozco
- Departamento de Arquitectura de Computadores y Automática, Universidad Complutense de Madrid, 28040 Madrid, Spain
- Eva Besada-Portas
- Departamento de Arquitectura de Computadores y Automática, Universidad Complutense de Madrid, 28040 Madrid, Spain
- Antonio Quesada
- Departamento de Biología, Universidad Autónoma de Madrid, 28049 Madrid, Spain.
7
Wang S, Sun Y, Gu H, Cao X, Shi Y, He Y. A deep learning model integrating a wind direction-based dynamic graph network for ozone prediction. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 946:174229. [PMID: 38917895 DOI: 10.1016/j.scitotenv.2024.174229] [Received: 02/05/2024] [Revised: 06/11/2024] [Accepted: 06/21/2024] [Indexed: 06/27/2024]
Abstract
Ozone pollution is an important environmental issue in many countries. Accurate forecasting of ozone concentration enables relevant authorities to enact timely policies to mitigate adverse impacts. This study develops a novel hybrid deep learning model, named wind direction-based dynamic spatio-temporal graph network (WDDSTG-Net), for hourly ozone concentration prediction. The model uses a dynamic directed graph structure based on hourly changing wind direction data to capture evolving spatial relationships between air quality monitoring stations. It applies a graph attention mechanism to compute dynamic weights between connected stations, thereby aggregating neighborhood information adaptively. For temporal modeling, it uses a sequence-to-sequence model with an attention mechanism to extract long-range temporal dependencies. Additionally, it integrates meteorological predictions to guide the ozone forecasting. The model achieves mean absolute errors of 6.69 μg/m3 and 18.63 μg/m3 for 1-h and 24-h prediction, respectively, outperforming several classic models. The model's IAQI prediction accuracy is above 75 % at all stations, with a maximum of 81.74 %. It also exhibits strong capabilities in predicting severe ozone pollution events, with a 24-h true positive rate of 0.77. Compared to traditional static graph models, WDDSTG-Net demonstrates the importance of incorporating short-term wind fluctuations and transport dynamics for data-driven air quality modeling. In principle, it may serve as an effective data-driven approach for the concentration prediction of other airborne pollutants.
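One simple way to picture a wind-direction-based directed graph is to connect each station to the stations lying roughly downwind of it and to rebuild the edges whenever the wind changes. The sketch below makes several assumptions that are not taken from the paper: planar station coordinates in kilometres, the meteorological convention that wind direction is the bearing the wind blows from, and a hypothetical angular tolerance `tolerance_deg`. It does not include the graph attention or sequence-to-sequence parts of WDDSTG-Net.

```python
import math


def bearing_deg(src, dst):
    """Compass bearing from src to dst (0 = north, 90 = east); x east, y north."""
    dx = dst[0] - src[0]
    dy = dst[1] - src[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0


def angular_gap(a, b):
    """Smallest absolute difference between two bearings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def wind_directed_edges(stations, wind_from_deg, tolerance_deg=60.0):
    """Directed edges (src, dst) where dst lies roughly downwind of src."""
    edges = []
    for src, src_xy in stations.items():
        downwind = (wind_from_deg[src] + 180.0) % 360.0  # bearing the wind blows towards
        for dst, dst_xy in stations.items():
            if dst == src:
                continue
            if angular_gap(bearing_deg(src_xy, dst_xy), downwind) <= tolerance_deg:
                edges.append((src, dst))
    return edges


# Hypothetical station layout (km) and two hourly wind snapshots.
stations = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (0.0, 10.0)}
edges_west = wind_directed_edges(stations, {"A": 270.0, "B": 270.0, "C": 270.0})
edges_south = wind_directed_edges(stations, {"A": 180.0, "B": 180.0, "C": 180.0})
```

With a westerly wind the edges point towards the eastern station B, and with a southerly wind they point towards the northern station C, so the same three stations yield different graphs from hour to hour.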
Affiliation(s)
- Shiyi Wang
- College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, China
- Yiming Sun
- College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, China
- Haonan Gu
- College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, China
- Xiaoyong Cao
- College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, China; Institute of Zhejiang University-Quzhou, Quzhou 324000, China
- Yao Shi
- College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, China
- Yi He
- College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, China; Institute of Zhejiang University-Quzhou, Quzhou 324000, China; Department of Chemical Engineering, University of Washington, Seattle 98915, USA.
8
Rautela KS, Goyal MK. Transforming air pollution management in India with AI and machine learning technologies. Sci Rep 2024; 14:20412. [PMID: 39223178 PMCID: PMC11369276 DOI: 10.1038/s41598-024-71269-7] [Received: 03/11/2024] [Accepted: 08/26/2024] [Indexed: 09/04/2024] Open
Abstract
A comprehensive approach is essential in India's ongoing battle against air pollution, combining technological advancements, regulatory reinforcement, and widespread societal engagement. Bridging technological gaps involves deploying sophisticated pollution control technologies and addressing the rural-urban disparity through innovative solutions. The review found that integrating Artificial Intelligence and Machine Learning (AI&ML) in air quality forecasting demonstrates promising results with remarkable model efficiency. In this study, we first compute the PM2.5 concentration over India using the surface mass concentrations of five key aerosols: black carbon (BC), dust (DU), organic carbon (OC), sea salt (SS) and sulphates (SU). The study identifies several regions highly vulnerable to PM2.5 pollution due to specific sources. The Indo-Gangetic Plains are notably impacted by high concentrations of BC, OC, and SU resulting from anthropogenic activities. Western India experiences higher DU concentrations due to its proximity to the Sahara Desert. Additionally, certain areas in northeast India show significant contributions of OC from biogenic activities. Moreover, an AI&ML model based on a convolutional autoencoder architecture underwent rigorous training, testing, and validation to forecast PM2.5 concentrations across India. The results reveal its exceptional precision in PM2.5 prediction, as demonstrated by model evaluation metrics, including a Structural Similarity Index exceeding 0.60, a Peak Signal-to-Noise Ratio ranging from 28 to 30 dB and a Mean Square Error below 10 μg/m3. However, regulatory challenges persist, necessitating robust frameworks and consistent enforcement mechanisms, as evidenced by the complexities in predicting PM2.5 concentrations. 
Implementing tailored regional pollution control strategies, integrating AI&ML technologies, strengthening regulatory frameworks, promoting sustainable practices, and encouraging international collaboration are essential policy measures to mitigate air pollution in India.
Affiliation(s)
- Kuldeep Singh Rautela
- Department of Civil Engineering, Indian Institute of Technology Indore, Simrol, Indore, 453552, Madhya Pradesh, India
- Manish Kumar Goyal
- Department of Civil Engineering, Indian Institute of Technology Indore, Simrol, Indore, 453552, Madhya Pradesh, India.
9
Nguyen DH, Liao CH, Bui XT, Wang LC, Yuan CS, Lin C. Deseasonalized trend of ground-level ozone and its precursors in an industrial city Kaohsiung, Taiwan. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2024; 351:124036. [PMID: 38677459 DOI: 10.1016/j.envpol.2024.124036] [Received: 12/28/2023] [Revised: 04/17/2024] [Accepted: 04/22/2024] [Indexed: 04/29/2024]
Abstract
Mitigating ground-level ozone (GLO) remains challenging due to its highly nonlinear formation process. Thus, understanding GLO pollution trends is crucial for developing effective control strategies, especially in the industrial city of Kaohsiung, Taiwan. Based on a long-term monitoring data set for 2011-2022, temporal analysis reveals that monthly mean GLO peaks in autumn (40.66 ± 5.10 ppb), whereas carbon monoxide (CO) and major precursors such as nitrogen oxides (NOx) and nonmethane hydrocarbons (NMHC) reach their highest levels in winter. The distinct seasonal variation of air pollutants in Kaohsiung is primarily influenced by the blocking effect of the mountainous area under the northeasterly wind: the city is situated downwind, and the accumulation of stagnant air hinders the dispersion of pollutants, causing high GLO levels during autumn. Over the 12 years (2011-2022), the deseasonalized trend analysis (p < 0.001) revealed a stabilization of GLO (+0.04 ppb/yr) after a previous sharp increase. The observed improvement is credited to a drastic decrease in total oxidants (Ox) of -0.63 ppb/yr due to significant reductions in their precursors. Furthermore, the effectiveness of precursor reduction is also supported by changes in the GLO daily maximum profile. While high GLO events (>120 ppb) decreased, days in the midrange (60-80 ppb) rose from 24.4% to 33.3%. A notable difference emerges when comparing daytime and nighttime GLO. While daytime GLO decreased at -0.22 ppb/yr, nighttime GLO increased at +0.34 ppb/yr. Weakened nocturnal titration effects accounted for the nighttime increase. The distinct spatial variations in GLO trends on a citywide scale underscore that areas with complicated industrial activities may not benefit from a continuing reduction of precursors as much as less-polluted areas. 
The findings of this study hold significant implications for improving GLO control strategies in heavily industrialized cities and provide valuable information to the general public about the current state of GLO pollution.
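A deseasonalized trend can be sketched as two steps: subtract each calendar month's long-term mean, then fit a straight line to the remaining anomalies. The abstract does not state which deseasonalization or trend estimator the authors used, so the climatology subtraction and ordinary least squares below are assumptions, and the monthly series is synthetic.

```python
import math


def deseasonalize(monthly_values):
    """Subtract each calendar month's mean; the series is assumed to start in January."""
    month_means = []
    for month in range(12):
        values = monthly_values[month::12]
        month_means.append(sum(values) / len(values))
    return [v - month_means[i % 12] for i, v in enumerate(monthly_values)]


def linear_trend_per_year(anomalies):
    """Ordinary least-squares slope of monthly anomalies, in units per year."""
    n = len(anomalies)
    years = [i / 12.0 for i in range(n)]
    mean_t = sum(years) / n
    mean_a = sum(anomalies) / n
    numerator = sum((t - mean_t) * (a - mean_a) for t, a in zip(years, anomalies))
    denominator = sum((t - mean_t) ** 2 for t in years)
    return numerator / denominator


# Synthetic 4-year series: seasonal cycle plus a built-in trend of 0.5 units per year.
values = [
    30.0 + 5.0 * math.sin(2.0 * math.pi * (i % 12) / 12.0) + 0.5 * (i / 12.0)
    for i in range(48)
]
anomalies = deseasonalize(values)
slope = linear_trend_per_year(anomalies)
```

On this synthetic series the fitted slope comes out close to, but slightly below, the built-in 0.5 units per year, because the monthly means absorb the within-year part of the trend. A seasonal cycle of 10 units peak to peak would otherwise dwarf a trend of this size.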
Affiliation(s)
- Duy-Hieu Nguyen
- Program in Maritime Science and Technology, College of Maritime, National Kaohsiung University of Science and Technology, Kaohsiung, 811213, Taiwan
- Chih-Hsiang Liao
- Department of Environmental Engineering and Science, Chia-Nan University of Pharmacy and Science, Tainan, 71710, Taiwan
- Xuan-Thanh Bui
- Key Laboratory of Advanced Waste Treatment Technology & Faculty of Environment and Natural Resources, Ho Chi Minh City University of Technology (HCMUT), 268 Ly Thuong Kiet Street, District 10, Ho Chi Minh City, 700000, Viet Nam; Vietnam National University Ho Chi Minh City (VNU-HCM), Linh Trung ward, Ho Chi Minh City, 700000, Viet Nam
- Lin-Chi Wang
- Department of Marine Environmental Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, 811213, Taiwan
- Chung-Shin Yuan
- Institute of Environmental Engineering, National Sun Yat-Sen University, Kaohsiung, 80424, Taiwan
- Chitsan Lin
- Program in Maritime Science and Technology, College of Maritime, National Kaohsiung University of Science and Technology, Kaohsiung, 811213, Taiwan; Department of Marine Environmental Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, 811213, Taiwan.
10
Shi G, Leung Y, Zhang J, Zhou Y. Modeling the air pollution process using a novel multi-site and multi-scale method with adaptive utilization of spatio-temporal information. CHEMOSPHERE 2024; 349:140799. [PMID: 38052313 DOI: 10.1016/j.chemosphere.2023.140799] [Received: 08/15/2023] [Revised: 11/15/2023] [Accepted: 11/22/2023] [Indexed: 12/07/2023]
Abstract
This study focuses on modeling air quality with an adaptive utilization of spatio-temporal information from multiple air quality monitoring stations under a multi-scale framework. To this end, it is necessary to consider different strategies for combining methods that decompose the given series and fuse multi-site information. Based on a systematic comparative analysis, we propose a novel multi-scale and multi-site modeling method named the multivariate empirical mode decomposition and spatial cosine-attention-based long short-term memory (MEMD-SCA-LSTM). The MEMD-SCA-LSTM first employs MEMD to decompose the multi-site air quality series into scale-aligned components and then models the components at different scales. The high-frequency components are modeled by a novel SCA-LSTM, which employs LSTM with residual blocks to extract the temporal information and utilizes an attention mechanism based on cosine similarity to adaptively extract interactions among different sites. Other, relatively regular components are modeled by the LSTM. An empirical study on PM2.5 in Hong Kong has demonstrated the effectiveness of fusing multi-site information using the spatial attention (SA) mechanism under the multi-scale framework with MEMD. The proposed MEMD-SCA-LSTM improves the one-day-ahead modeling performance, with the mean absolute error and the root mean square error reduced by over 10% compared to the baseline modeling methods. For the two-day and three-day-ahead performance, the MEMD-SCA-LSTM is still the best one. Furthermore, by visualizing the attention weights, we illustrate that our proposed SCA-LSTM can overcome some limitations of the traditional attention mechanisms and that the attention weights exhibit more informative patterns, which could be used to analyse the transport of air pollutants between sites. The proposed modeling method is a general method, which is feasible and applicable to other pollutants in other cities or regions.
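Attention based on cosine similarity can be sketched as follows: score each neighbouring site by the cosine similarity between its feature vector and the target site's, turn the scores into weights with a softmax, and take the weighted sum. This is an illustration of the general idea only. In the paper the vectors would come from LSTM outputs, whereas the site names and two-dimensional vectors below are hypothetical, and the softmax normalisation is an assumption.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_attention(target, site_features):
    """Return per-site attention weights and the attention-weighted feature vector."""
    names = list(site_features)
    weights = softmax([cosine_similarity(target, site_features[n]) for n in names])
    fused = [
        sum(w * site_features[n][d] for w, n in zip(weights, names))
        for d in range(len(target))
    ]
    return dict(zip(names, weights)), fused


target_site = [1.0, 0.0]                   # hypothetical feature vector of the target site
neighbours = {
    "aligned": [2.0, 0.0],                 # same direction as the target
    "orthogonal": [0.0, 3.0],              # unrelated to the target
    "opposite": [-1.0, 0.0],               # opposite direction
}
weights, fused = cosine_attention(target_site, neighbours)
```

Because cosine similarity ignores vector magnitude, a site is weighted by how closely its pattern matches the target's rather than by how large its values are. The scores are also bounded between -1 and 1, so no single site can dominate through scale alone.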
Affiliation(s)
- Guang Shi
- School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, 710049, Shaanxi, China; School of Computer Science, Xi'an Polytechnic University, Xi'an, 710048, Shaanxi, China
- Yee Leung
- Institute of Future Cities, The Chinese University of Hong Kong, Shatin, Hong Kong, China; Department of Geography and Resource Management, The Chinese University of Hong Kong, Shatin, Hong Kong, China
- Jiangshe Zhang
- School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, 710049, Shaanxi, China
- Yu Zhou
- Institute of Future Cities, The Chinese University of Hong Kong, Shatin, Hong Kong, China; School of Urban & Regional Science and Institute for Global Innovation and Development, East China Normal University, Shanghai, 200241, China.
11
Fung PL, Savadkoohi M, Zaidan MA, Niemi JV, Timonen H, Pandolfi M, Alastuey A, Querol X, Hussein T, Petäjä T. Constructing transferable and interpretable machine learning models for black carbon concentrations. ENVIRONMENT INTERNATIONAL 2024; 184:108449. [PMID: 38286044 DOI: 10.1016/j.envint.2024.108449] [Received: 11/08/2023] [Revised: 01/12/2024] [Accepted: 01/17/2024] [Indexed: 01/31/2024]
Abstract
Black carbon (BC) has received increasing attention from researchers due to its adverse health effects. However, in-situ BC measurements are often not included as a regulated variable in air quality monitoring networks. Machine learning (ML) models have been studied extensively as virtual sensors to complement the reference instruments. This study evaluates and compares three white-box (WB) and four black-box (BB) ML models for estimating BC concentrations, with a focus on their transferability and interpretability. We train the models with long-term air pollutant and weather measurements at an urban background site in Barcelona, and test them at other European urban and traffic sites. Despite the differences in geographical location and measurement site, BC correlates most strongly with the particle number concentration of the accumulation mode (PNacc, r = 0.73-0.85) and nitrogen dioxide (NO2, r = 0.68-0.85) and most weakly with meteorological parameters. Owing to the similarity in correlation behaviour, the ML models trained in Barcelona perform well at the traffic site in Helsinki (R2 = 0.80-0.86; mean absolute error MAE = 3.90-4.73 %) and at the urban background site in Dresden (R2 = 0.79-0.84; MAE = 4.23-4.82 %). WB models appear to explain less of the variability of BC than BB models, among which the long short-term memory (LSTM) model outperforms the rest. In terms of interpretability, we adopt several methods for each individual model to quantify and normalize the relative importance of each input feature. The overall static relative importance commonly used for WB models gives different results from the dynamic values used to show local contributions for BB models. PNacc and NO2 on average have the strongest absolute static contribution; however, they simultaneously impact the estimation positively and negatively at different sites. 
This comprehensive analysis demonstrates that the possibility of these interpretable air pollutant ML models to be transfered across space and time.
Affiliation(s)
- Pak Lun Fung
- Institute for Atmospheric and Earth System Research / Physics, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland; Helsinki Institute of Sustainability Science, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland.
- Marjan Savadkoohi
- Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Spain; Department of Mining, Industrial and ICT Engineering (EMIT), Manresa School of Engineering (EPSEM), Universitat Politècnica de Catalunya (UPC), Manresa 08242, Spain.
- Martha Arbayani Zaidan
- Institute for Atmospheric and Earth System Research / Physics, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland; Helsinki Institute of Sustainability Science, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland; Department of Computer Science, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland.
- Jarkko V Niemi
- Helsinki Region Environmental Services Authority (HSY), Helsinki FI-00066, Finland.
- Hilkka Timonen
- Atmospheric Composition Research, Finnish Meteorological Institute, Helsinki FI-00560, Finland.
- Marco Pandolfi
- Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Spain.
- Andrés Alastuey
- Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Spain.
- Xavier Querol
- Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Spain.
- Tareq Hussein
- Institute for Atmospheric and Earth System Research / Physics, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland; Environmental and Atmospheric Research Laboratory (EARL), Department of Physics, School of Science, Amman 11942, Jordan.
- Tuukka Petäjä
- Institute for Atmospheric and Earth System Research / Physics, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland.
|
12
|
Searcy RT, Boehm AB. Know Before You Go: Data-Driven Beach Water Quality Forecasting. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:17930-17939. [PMID: 36472482 DOI: 10.1021/acs.est.2c05972] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/17/2023]
Abstract
Forecasting environmental hazards is critical in preventing or building resilience to their impacts on human communities and ecosystems. Environmental data science is an emerging field that can be harnessed for forecasting, yet more work is needed to develop methodologies that can leverage increasingly large and complex data sets for decision support. Here, we design a data-driven framework that can, for the first time, forecast bacterial standard exceedances at marine beaches with 3 days lead time. Using historical data sets collected at two California sites, we train nearly 400 forecast models using statistical and machine learning techniques and test forecasts against predictions from both a naive "persistence" model and a baseline nowcast model. Overall, forecast models are found to have similar sensitivities and specificities to the persistence model, but significantly higher areas under the ROC curve (a metric distinguishing a model's ability to effectively parse classes across decision thresholds), suggesting that forecasts can provide enhanced information beyond past observations alone. Forecast model performance at all lead times was similar to that of nowcast models. Together, results suggest that integrating the forecasting framework developed in this study into beach management programs can enable better public notification and aid in proactive pollution and health risk management.
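The comparison against a naive "persistence" model described in this abstract can be illustrated with a minimal sketch. It assumes binary exceedance labels (1 = bacterial standard exceeded) and a 3-day lead; the function names and the toy data are hypothetical and are not taken from the study.

```python
from typing import List, Sequence, Tuple


def persistence_forecast(labels: Sequence[int], lead: int) -> Tuple[List[int], List[int]]:
    """Pair each day's observed label with the label seen `lead` days earlier.

    Returns (y_true, y_pred), where y_pred is the persistence prediction.
    """
    y_true = list(labels[lead:])
    y_pred = list(labels[:-lead])
    return y_true, y_pred


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).

    Assumes both classes occur in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve: the probability that a randomly chosen
    positive case receives a higher score than a randomly chosen negative
    case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up labels, for illustration only).
observed = [0, 0, 1, 1, 0, 0, 0, 1, 0, 0]
y_true, y_pred = persistence_forecast(observed, lead=3)
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Because the area under the ROC curve is computed from continuous scores across all decision thresholds, a forecast model can match the persistence model in sensitivity and specificity at one threshold and still rank cases better overall.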
Affiliation(s)
- Ryan T Searcy
- Department of Civil & Environmental Engineering, Stanford University, 473 Via Ortega, Stanford, California 94305, United States
- Alexandria B Boehm
- Department of Civil & Environmental Engineering, Stanford University, 473 Via Ortega, Stanford, California 94305, United States
|
13
|
Wen C, Lin X, Ying Y, Ma Y, Yu H, Li X, Yan J. Dioxin emission prediction from a full-scale municipal solid waste incinerator: Deep learning model in time-series input. WASTE MANAGEMENT (NEW YORK, N.Y.) 2023; 170:93-102. [PMID: 37562201 DOI: 10.1016/j.wasman.2023.08.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/23/2023] [Revised: 07/02/2023] [Accepted: 08/04/2023] [Indexed: 08/12/2023]
Abstract
The immeasurability of real-time dioxin emissions is the principal limitation to controlling and reducing dioxin emissions in municipal solid waste incineration (MSWI). Existing methods for dioxin emission prediction are based on machine learning with inadequate dioxin datasets. In this study, deep learning models are trained on a larger set of online dioxin emission data from a waste incinerator to predict real-time dioxin emissions. First, data are collected and the operating data are preprocessed. Then, the dioxin emission prediction performance of the machine learning and deep learning models, including long short-term memory (LSTM) and convolutional neural networks (CNN), is compared for normal input and time-series input. We evaluate the applicability of each model and find that, with the time-series input, the performance of the deep learning models (LSTM and CNN) improves by 36.5% and 30.4%, respectively, in terms of the mean square error (MSE). Moreover, through feature analysis, we find that temperature, airflow, and the time dimension are important for dioxin prediction. The results are meaningful for optimizing the control of dioxins from MSWI.
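The difference between "normal input" and "time-series input" can be sketched as follows. This is an illustrative assumption about how such inputs are commonly built (a sliding window over past operating data), not the authors' preprocessing code; `make_windows` and the toy values are hypothetical.

```python
from typing import List, Sequence, Tuple


def make_windows(
    features: Sequence[Sequence[float]],
    targets: Sequence[float],
    window: int,
) -> Tuple[List[List[List[float]]], List[float]]:
    """Build time-series samples from row-wise operating data.

    Each sample stacks the `window` most recent feature rows; its target is
    the emission value at the last time step of that window. With window=1
    this reduces to the "normal" one-row-per-sample input.
    """
    samples: List[List[List[float]]] = []
    labels: List[float] = []
    for end in range(window, len(features) + 1):
        samples.append([list(row) for row in features[end - window:end]])
        labels.append(targets[end - 1])
    return samples, labels


# Toy operating data: two hypothetical channels (e.g. temperature, airflow).
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
emissions = [0.1, 0.2, 0.3, 0.4]
X, y = make_windows(rows, emissions, window=2)
```

Each sample in `X` has shape (window, number of features), which is the layout that recurrent and one-dimensional convolutional models usually expect.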
Affiliation(s)
- Chaojun Wen
- Polytechnic Institute, Zhejiang University, Hangzhou 310027, China
- Xiaoqing Lin
- Polytechnic Institute, Zhejiang University, Hangzhou 310027, China; State Key Laboratory of Clean Energy Utilization, Institute for Thermal Power Engineering, Zhejiang University, Hangzhou 310027, China.
- Yuxuan Ying
- State Key Laboratory of Clean Energy Utilization, Institute for Thermal Power Engineering, Zhejiang University, Hangzhou 310027, China
- Yunfeng Ma
- State Key Laboratory of Clean Energy Utilization, Institute for Thermal Power Engineering, Zhejiang University, Hangzhou 310027, China
- Hong Yu
- State Key Laboratory of Clean Energy Utilization, Institute for Thermal Power Engineering, Zhejiang University, Hangzhou 310027, China
- Xiaodong Li
- State Key Laboratory of Clean Energy Utilization, Institute for Thermal Power Engineering, Zhejiang University, Hangzhou 310027, China; Key Laboratory of Clean Energy and Carbon Neutrality of Zhejiang Province, Jiaxing Research Institute, Zhejiang University, Jiaxing 314031, China
- Jianhua Yan
- State Key Laboratory of Clean Energy Utilization, Institute for Thermal Power Engineering, Zhejiang University, Hangzhou 310027, China
|
14
|
Villoria Hernandez P, Mariñas-Collado I, Garcia Sipols A, Simon de Blas C, Rodriguez Sánchez MC. Time series forecasting methods in emergency contexts. Sci Rep 2023; 13:16141. [PMID: 37752198 PMCID: PMC10522600 DOI: 10.1038/s41598-023-42917-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Accepted: 09/16/2023] [Indexed: 09/28/2023] Open
Abstract
The key issues in any fire emergency are recognising fire hotspots, locating the emergency intervention team (EI), following the evolution of the fire, and selecting the evacuation path. This motivates the study and development of HelpResponder, a solution capable of detecting the focus of interest in hostile spaces where fire causes high temperatures and a lack of visibility. A study is conducted to determine which model best predicts measured [Formula: see text] levels. The variables used are temperature, humidity, and air quality, obtained from sensors installed in a fire tower. The statistical methods applied, namely ARIMAX, KNN, SVM, and TBATS, allow the adjustment and modelling of the variables. As a newly proposed improvement, explanatory variables with temporal structure are incorporated into the SVM. Moreover, combining different models showed the best efficiency in forecasting. Another contribution of our work lies in offering a small-scale prediction system that is specifically designed to save battery power. The system has been tested and validated in several hostile environments, including a building, simulating real emergency situations. It can help firefighters respond faster in an emergency. This reduces the risks associated with the lack of information and improves the time for tactical operations, which could save lives.
Affiliation(s)
- I Mariñas-Collado
- Department of Statistics and Operations Research and Mathematics Didactics, University of Oviedo, Oviedo, Spain.
- A Garcia Sipols
- Department of Applied Mathematics, Materials Science and Engineering and Electronic Technology, Rey Juan Carlos University, Madrid, Spain
- C Simon de Blas
- Department of Computer Sciences and Statistics, Rey Juan Carlos University, Madrid, Spain
|
15
|
Panneerselvam V, Thiagarajan R. ACBiGRU-DAO: Attention Convolutional Bidirectional Gated Recurrent Unit-based Dynamic Arithmetic Optimization for Air Quality Prediction. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023; 30:86804-86820. [PMID: 37410321 DOI: 10.1007/s11356-023-28028-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Accepted: 05/28/2023] [Indexed: 07/07/2023]
Abstract
Over the past decades, air pollution has turned out to be a major cause of environmental degradation and health effects, particularly in developing countries like India. Various measures are taken by scholars and governments to control or mitigate air pollution. An air quality prediction model triggers an alarm when the air quality becomes hazardous or when the pollutant concentration surpasses the defined limit. Accurate air quality assessment has become an indispensable step in many urban and industrial areas to monitor and preserve the quality of air. To accomplish this goal, this paper proposes a novel Attention Convolutional Bidirectional Gated Recurrent Unit based Dynamic Arithmetic Optimization (ACBiGRU-DAO) approach, in which the parameters of the Attention Convolutional Bidirectional Gated Recurrent Unit (ACBiGRU) model are fine-tuned by the Dynamic Arithmetic Optimization (DAO) algorithm. The air quality data of India was acquired from the Kaggle website. From the dataset, the most influential features, namely Air Quality Index (AQI), particulate matter (PM2.5 and PM10), carbon monoxide (CO) concentration, nitrogen dioxide (NO2) concentration, sulfur dioxide (SO2) concentration, and ozone (O3) concentration, are taken as input data. Initially, they are preprocessed in two steps, namely imputation of missing values and data transformation. Finally, the proposed ACBiGRU-DAO approach predicts air quality and classifies it by severity into six AQI stages. The efficiency of the proposed ACBiGRU-DAO approach is examined using diverse evaluation indicators, namely Accuracy, Maximum Prediction Error (MPE), Mean Absolute Error (MAE), Mean Square Error (MSE), Root Mean Square Error (RMSE), and Correlation Coefficient (CC). The simulation results show that the proposed ACBiGRU-DAO approach achieves an accuracy of about 95.34%, higher than that of the compared methods.
Affiliation(s)
- Vinoth Panneerselvam
- Department of Computer Science and Engineering, Mepco Schlenk Engineering College, Sivakasi, India.
- Revathi Thiagarajan
- Department of Information Technology, Mepco Schlenk Engineering College, Sivakasi, India
|
16
|
Agarwal A, Sahu M. Forecasting PM2.5 concentrations using statistical modeling for Bengaluru and Delhi regions. ENVIRONMENTAL MONITORING AND ASSESSMENT 2023; 195:502. [PMID: 36949261 DOI: 10.1007/s10661-023-11045-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2022] [Accepted: 02/21/2023] [Indexed: 06/18/2023]
Abstract
India is home to some of the most polluted cities on the planet. The worsening air quality in most of the cities has reached the point of severely affecting human health and life expectancy. An early warning system in which people are alerted well before an adverse air quality episode can go a long way in preventing exposure to harmful air conditions. Having such a system can also help the government to take better mitigation and preventive measures. Forecasting systems based on machine learning are gaining importance due to their cost-effectiveness and applicability to small towns and villages, where most complex models are not feasible due to resource constraints and limited data availability. This paper presents a study of air quality forecasting by application of statistical models. Three statistical models based on autoregression (AR), moving average (MA), and autoregressive integrated moving average (ARIMA) models were applied to the datasets of PM2.5 concentrations of Delhi and Bengaluru, and forecasting was done for 1-day-ahead and 7-day-ahead time frames. All three models forecasted PM2.5 reasonably well for Bengaluru, but the model performance deteriorated for the Delhi region. For Bengaluru, the AR, MA, and ARIMA models achieved a mean absolute percentage error (MAPE) of 10.82%, 7.94%, and 8.17% respectively for the 7-day-ahead forecast and of 7.35%, 5.62%, and 5.87% for the 1-day-ahead forecast. For the Delhi region, the AR, MA, and ARIMA models gave a MAPE of 27.82%, 24.62%, and 27.32% respectively for the 7-day-ahead forecast, and of 24.48%, 23.53%, and 23.72% respectively for the 1-day-ahead forecast. The analysis showed that the ARIMA model performs better in comparison to the other models, but performance varies with varying concentration regimes. The study indicates that other topographical and meteorological parameters need to be incorporated to develop better models and account for the effects of these parameters.
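As a rough illustration of the quantities reported in this abstract, the sketch below computes the mean absolute percentage error and fits a first-order autoregressive model by ordinary least squares. The AR order, the fitting method, and the toy series are assumptions made for illustration; the abstract does not state the model orders or estimation procedure used in the study.

```python
from typing import List, Sequence, Tuple


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def fit_ar1(series: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    xs, ys = series[:-1], series[1:]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    phi = sxy / sxx
    return mean_y - phi * mean_x, phi


def forecast_ar1(c: float, phi: float, last_value: float, steps: int) -> List[float]:
    """Iterate the fitted recursion to obtain multi-step-ahead forecasts."""
    forecasts: List[float] = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        forecasts.append(value)
    return forecasts


# Toy series that follows x[t] = 2 + 0.5 * x[t-1] exactly (not real PM2.5 data).
history = [10.0, 7.0, 5.5, 4.75, 4.375]
c, phi = fit_ar1(history)
week_ahead = forecast_ar1(c, phi, history[-1], steps=7)
```

In a multi-step forecast each prediction feeds the next one, so errors accumulate with the horizon, which is consistent with the 7-day-ahead MAPE values being higher than the 1-day-ahead values.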
Affiliation(s)
- Akash Agarwal
- Aerosol and Nanoparticle Technology Laboratory, Environmental Science and Engineering Department, Indian Institute of Technology Bombay, Powai, Mumbai, India, 400076
- Manoranjan Sahu
- Aerosol and Nanoparticle Technology Laboratory, Environmental Science and Engineering Department, Indian Institute of Technology Bombay, Powai, Mumbai, India, 400076.
- Interdisciplinary Program in Climate Studies, Indian Institute of Technology Bombay, Mumbai, 400076, India.
- Centre for Machine Intelligence and Data Science, Indian Institute of Technology Bombay, Mumbai, 400076, India.
|
17
|
Msakni MK, Risan A, Schütz P. Using machine learning prediction models for quality control: a case study from the automotive industry. COMPUTATIONAL MANAGEMENT SCIENCE 2023; 20:14. [PMID: 36942085 PMCID: PMC10019438 DOI: 10.1007/s10287-023-00448-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2022] [Accepted: 03/01/2023] [Indexed: 06/18/2023]
Abstract
This paper studies a prediction problem using time series data and machine learning algorithms. The case study is related to the quality control of bumper beams in the automotive industry. These parts are milled during the production process, and the locations of the milled holes are subject to strict tolerance limits. Machine learning models are used to predict the location of milled holes in the next beam. By doing so, tolerance violations are detected at an early stage, and the production flow can be improved. A standard neural network, a long short term memory network (LSTM), and random forest algorithms are implemented and trained with historical data, including a time series of previous product measurements. Experiments indicate that all models have similar predictive capabilities with a slight dominance for the LSTM and random forest. The results show that some holes can be predicted with good quality, and the predictions can be used to improve the quality control process. However, other holes show poor results and support the claim that real data problems are challenged by inappropriate information or a lack of relevant information.
Affiliation(s)
- Mohamed Kais Msakni
- Department of Industrial Economics and Technology Management, Norwegian University of Science and Technology, Torgarden, 7491 Trondheim, Norway
- Anders Risan
- Department of Industrial Economics and Technology Management, Norwegian University of Science and Technology, Torgarden, 7491 Trondheim, Norway
- Peter Schütz
- Department of Industrial Economics and Technology Management, Norwegian University of Science and Technology, Torgarden, 7491 Trondheim, Norway
|
18
|
Ayus I, Natarajan N, Gupta D. Comparison of machine learning and deep learning techniques for the prediction of air pollution: a case study from China. ASIAN JOURNAL OF ATMOSPHERIC ENVIRONMENT 2023; 17:4. [PMCID: PMC10214349 DOI: 10.1007/s44273-023-00005-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 05/12/2023] [Indexed: 09/07/2023]
Abstract
The adverse effect of air pollution has always been a problem for human health. The presence of a high level of air pollutants can cause severe illnesses such as emphysema, chronic obstructive pulmonary disease (COPD), or asthma. Air quality prediction helps us to undertake practical action plans for controlling air pollution. The Air Quality Index (AQI) reflects the degree of concentration of pollutants in a locality. The average AQI was calculated for the various cities in China to understand the annual trends. Furthermore, the air quality index has been predicted for ten major cities across China using five different deep learning techniques, namely, Recurrent Neural Network (RNN), Bidirectional Gated Recurrent unit (Bi-GRU), Bidirectional Long Short-Term Memory (BiLSTM), Convolutional Neural Network BiLSTM (CNN-BiLSTM), and Convolutional BiLSTM (Conv1D-BiLSTM). The performance of these models has been compared with a machine learning model, eXtreme Gradient Boosting (XGBoost) to discover the most efficient deep learning model. The results suggest that the machine learning model, XGBoost, outperforms the deep learning models. While Conv1D-BiLSTM and CNN-BiLSTM perform well among the deep learning models in the estimation of the air quality index (AQI), RNN and Bi-GRU are the least performing ones. Thus, both XGBoost and neural network models are capable of capturing the non-linearity present in the dataset with reliable accuracy.
Affiliation(s)
- Ishan Ayus
- Department of Computer Science and Engineering, ITER, Siksha ‘O’ Anusandhan University, Bhubaneswar, Odisha India
- Narayanan Natarajan
- Department of Civil Engineering, Dr. Mahalingam College of Engineering and Technology, Tamil Nadu, Pollachi, 642003 India
- Deepak Gupta
- Department of Computer Science & Engineering, MNNIT Allahabad, Prayagraj, 211004 India
|
19
|
Ko K, Cho S, Rao RR. Machine-Learning-Based Near-Surface Ozone Forecasting Model with Planetary Boundary Layer Information. SENSORS (BASEL, SWITZERLAND) 2022; 22:7864. [PMID: 36298214 PMCID: PMC9610675 DOI: 10.3390/s22207864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Revised: 10/09/2022] [Accepted: 10/10/2022] [Indexed: 06/16/2023]
Abstract
Surface ozone is one of six air pollutants designated as harmful by the National Ambient Air Quality Standards because it can adversely impact human health and the environment. Thus, ozone forecasting is a critical task that can help people avoid dangerously high ozone concentrations. Conventional numerical approaches, as well as data-driven forecasting approaches, have been studied for ozone forecasting. Data-driven forecasting models, in particular, have gained momentum with the introduction of machine learning advancements. We consider planetary boundary layer (PBL) height as a new input feature for data-driven ozone forecasting models. PBL has been shown to impact ozone concentrations, making it an important factor in ozone forecasts. In this paper, we investigate the effect of using PBL height on the performance of surface ozone forecasts. We present two surface ozone forecasting models, based on multilayer perceptron (MLP) and bidirectional long short-term memory (LSTM) models. These two models forecast hourly ozone concentrations for an upcoming 24-h period using two types of input data, namely measurement data and PBL height. We consider the predicted values of PBL height obtained from the weather research and forecasting (WRF) model, since it is difficult to gather actual PBL measurements. We evaluate the two ozone forecasting models in terms of index of agreement (IOA), mean absolute error (MAE), and root mean square error (RMSE). Results showed that the MLP-based and bidirectional LSTM-based models yielded lower MAE and RMSE when considering forecasted PBL height, but there were no significant changes in IOA when compared with models in which no forecasted PBL data were used. This result suggests that utilizing forecasted PBL height can improve the forecasting performance of data-driven prediction models for surface ozone concentrations.
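The three evaluation metrics named in this abstract can be written out as a short sketch. The index of agreement below follows Willmott's original definition, IOA = 1 - Σ(P - O)² / Σ(|P - Ō| + |O - Ō|)²; the abstract does not say which IOA variant the authors used, so treating it as this form is an assumption, and the toy values are invented.

```python
import math
from typing import Sequence


def mae(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def index_of_agreement(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Willmott's index of agreement (1 = perfect match).

    Assumes the observations are not all identical to their mean and to the
    predictions, which would make the denominator zero.
    """
    obs_mean = sum(obs) / len(obs)
    numerator = sum((p - o) ** 2 for o, p in zip(obs, pred))
    denominator = sum((abs(p - obs_mean) + abs(o - obs_mean)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - numerator / denominator


# Toy hourly ozone values (illustrative only).
observed = [1.0, 2.0, 3.0]
flat_forecast = [2.0, 2.0, 2.0]
scores = (
    mae(observed, flat_forecast),
    rmse(observed, flat_forecast),
    index_of_agreement(observed, flat_forecast),
)
```

MAE and RMSE are expressed in the units of the concentration, whereas IOA is a dimensionless score normalised by the spread around the observed mean, so the two kinds of metric need not move together.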
Affiliation(s)
- Kabseok Ko
- Department of Electronics Engineering, Kangwon National University, Chuncheon 24341, Korea
- Seokheon Cho
- Qualcomm Institute, University of California, San Diego (UCSD), San Diego, CA 92093, USA
- Ramesh R. Rao
- Qualcomm Institute, University of California, San Diego (UCSD), San Diego, CA 92093, USA
|
20
|
Aksangür İ, Eren B, Erden C. Evaluation of data preprocessing and feature selection process for prediction of hourly PM10 concentration using long short-term memory models. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2022; 311:119973. [PMID: 35985430 DOI: 10.1016/j.envpol.2022.119973] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/02/2022] [Revised: 08/05/2022] [Accepted: 08/11/2022] [Indexed: 06/15/2023]
Abstract
Studies have confirmed that PM10, defined as respirable particles with diameters of 10 μm and smaller, has adverse effects on human health and the environment. Various estimation methods use historical data to determine the PM10 concentration for the purposes of controlling PM10 air pollution, early warning, and protecting public health and the environment. The present study analyses different Long Short-Term Memory (LSTM) models that can predict hourly PM10 concentration. In parallel, the study also investigates the effect of the data preprocessing and feature selection (DPFS) process on the prediction accuracy of the LSTM models. For this purpose, three different LSTM models, namely Vanilla, Bi-Directional, and Stacked, were developed. Then, a comprehensive data preprocessing stage is used to eliminate missing and erroneous data and outliers from real-world raw data, and a feature selection process is applied to remove unnecessary features. The LSTM models consider three air quality parameters, namely SO2, O3, and CO, and three meteorological factors, namely relative humidity, wind direction, and wind speed. The prediction performances of the LSTM models are compared using the RMSE, MAE and R2 performance indices according to whether DPFS is used in the models or not. As a result, when the DPFS process was applied, the proposed LSTM models achieved high prediction performance and can be used to predict hourly PM10 concentrations. Overall, the DPFS process significantly enhanced the developed LSTM models' prediction performance. Furthermore, the proposed model might be a useful tool for city administrators to make decisions and improve air quality management efforts.
Affiliation(s)
- İpek Aksangür
- Department of Environmental Engineering, Faculty of Engineering, Sakarya University, Esentepe, Sakarya, Turkey
- Beytullah Eren
- Department of Environmental Engineering, Faculty of Engineering, Sakarya University, Esentepe, Sakarya, Turkey; Halfeti Vocational School, Harran University, Halfeti, Şanlıurfa, Turkey.
- Caner Erden
- Department of International Trade and Finance, Faculty of Applied Science, Sakarya University of Applied Science, Sakarya, Turkey; AI Research and Application Center, Sakarya University of Applied Sciences, Sakarya, Turkey
|
21
|
Intelligent Forecasting of Air Quality and Pollution Prediction Using Machine Learning. ADSORPT SCI TECHNOL 2022. [DOI: 10.1155/2022/5086622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
Abstract
Air pollution consists of harmful gases and fine Particulate Matter (PM2.5), which affect the quality of air. This has not only become a key issue in scientific research but has also turned into an important social issue in public life. Therefore, many experts and scholars at R&D institutions and universities, including abroad, are involved in a great deal of research on PM2.5 pollutant prediction. In this scenario, the authors proposed various machine learning models, namely linear regression, random forest, KNN, ridge and lasso, XGBoost, and AdaBoost models, to predict PM2.5 pollutants in polluted cities. The experiment is carried out using Jupyter Notebook in Python 3.7.3. From the results with respect to the MAE, MAPE, and RMSE metrics, the XGBoost, AdaBoost, random forest, and KNN models (8.27, 0.40, and 13.85; 9.23, 0.45, and 10.59; 39.84, 1.94, and 54.59; and 49.13, 2.40, and 69.92, respectively) are observed to be the more reliable models. The PM2.5 pollutant concentration (PClow-PChigh) ranges observed for these models are 0-18.583 μg/m3, 18.583-25.023 μg/m3, 25.023-28.234 μg/m3, and 28.234-49.032 μg/m3, respectively, so these models can both predict the PM2.5 pollutant and forecast the air quality levels in a better way. On comparison between various existing models and the proposed models, it was observed that the proposed models can predict the PM2.5 pollutant with better performance and a reduced error rate than the existing models.
|
22
|
Environmental Pollution Analysis and Impact Study-A Case Study for the Salton Sea in California. ATMOSPHERE 2022. [DOI: 10.3390/atmos13060914] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/04/2023]
Abstract
A natural experiment conducted on the shrinking Salton Sea, a saline lake in California, showed that each one-foot drop in lake elevation resulted in a 2.6% average increase in PM2.5 concentrations. The shrinking has caused the asthma rate among children to continue to increase, with one in five children being sent to the emergency department for asthma-related reasons. In this paper, several data-driven machine learning (ML) models are developed for forecasting air quality and dust emission in order to study, evaluate and predict the impacts on human health of the shrinkage of seas such as the Salton Sea. The paper presents an improved long short-term memory (LSTM) model to predict the hourly air quality (O3 and CO) based on air pollutants and weather data in the previous 5 h. According to our experimental results, the model generates a very good R2 score of 0.924 and 0.835 for O3 and CO, respectively. In addition, the paper proposes an ensemble model based on random forest (RF) and gradient boosting (GBoost) algorithms for forecasting hourly PM2.5 and PM10 using the air quality and weather data in the previous 5 h. Furthermore, the paper shares our research results for PM2.5 and PM10 prediction based on the proposed ensemble ML models using satellite remote sensing data. Daily PM2.5 and PM10 concentration maps for 2018 are created to display the regional air pollution density and severity. Finally, the paper reports Artificial Intelligence (AI) based research findings on measuring the impact of air pollution on the asthma prevalence rate of local residents in the Salton Sea region. A stacked ensemble model based on support vector regression (SVR), elastic net regression (ENR), RF and GBoost is developed for asthma prediction with a good R2 score of 0.978.
|
23
|
Latifoğlu L. A novel combined model for prediction of daily precipitation data using instantaneous frequency feature and bidirectional long short time memory networks. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2022; 29:42899-42912. [PMID: 35092586 DOI: 10.1007/s11356-022-18874-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/20/2021] [Accepted: 01/21/2022] [Indexed: 06/14/2023]
Abstract
Meteorological events constantly affect human life. Excessive precipitation over a short time causes serious events such as floods, whereas insufficient precipitation over a long period causes drought. In recent years, significant changes in precipitation regimes have been observed, and these changes cause socio-economic and ecological problems. It is therefore of great importance to predict and analyze precipitation data correctly. In this study, a reliable and accurate precipitation forecasting model is proposed. To this end, three deep neural network models, long short-term memory (LSTM), gated recurrent unit (GRU), and bidirectional long short-term memory (biLSTM) networks, were applied to one-step-ahead forecasting of daily precipitation data, and their performances were compared. Moreover, to improve the far-ahead forecasting performance of the biLSTM model, the instantaneous frequency (IF) feature was used as an input parameter for the first time in the literature. A novel model combining IF and biLSTM was thus employed for one- to six-step-ahead forecasting of daily precipitation data. The performance of the proposed IF-biLSTM model was evaluated using the mean absolute error (MAE), mean square error (MSE), correlation coefficient (R), and coefficient of determination (R2), and spider charts were used to assess model performance. According to the numerical results, the biLSTM model outperformed the LSTM and GRU models. With the IF feature added, the IF-biLSTM model achieved the best forecasting performance for daily precipitation data, with R2 values of 0.9983, 0.9827, 0.9092, 0.8508, 0.7827, and 0.7646 for one- to six-step-ahead forecasting, respectively.
The IF-biLSTM model showed higher forecasting performance than the biLSTM model, especially for far-ahead forecasts, and the IF feature improved estimation performance.
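The four scores named in this abstract can be computed from paired observed and forecast values. The sketch below is a minimal illustration rather than the author's code: the function name `forecast_metrics` is hypothetical, and it assumes R2 means the coefficient of determination, computed as 1 minus the ratio of the residual to the total sum of squares (some studies report the square of R instead).

```python
import math


def forecast_metrics(observed, predicted):
    """Return MAE, MSE, Pearson R, and R2 for paired observed/forecast values."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    errors = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mean_o) ** 2 for o in observed)   # total sum of squares
    ss_p = sum((p - mean_p) ** 2 for p in predicted)

    r = cov / math.sqrt(ss_o * ss_p)                   # Pearson correlation
    r2 = 1.0 - sum(e * e for e in errors) / ss_o       # assumed R2 definition
    return {"MAE": mae, "MSE": mse, "R": r, "R2": r2}


# Synthetic example: a forecast that is always exactly 1 unit too high.
scores = forecast_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

In this synthetic case R is 1 while R2 is much lower, which shows why a constant bias can hide behind a high correlation and why both scores are usually reported together.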
Affiliation(s)
- Levent Latifoğlu
- Department of Civil Engineering, Faculty of Engineering, Erciyes University, Kayseri, Turkey.
24
|
Asha P, Natrayan L, Geetha BT, Beulah JR, Sumathy R, Varalakshmi G, Neelakandan S. IoT enabled environmental toxicology for air pollution monitoring using AI techniques. ENVIRONMENTAL RESEARCH 2022; 205:112574. [PMID: 34919959 DOI: 10.1016/j.envres.2021.112574] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 06/15/2021] [Revised: 11/09/2021] [Accepted: 12/12/2021] [Indexed: 06/14/2023]
Abstract
In past decades, industrial and technological development has grown exponentially, accompanied by injudicious and unsustainable use of non-renewable resources. At the same time, the environmental branch of toxicology has gained significant attention for understanding the effect of toxic chemicals on human health. Environmental toxic agents cause several diseases, with particularly high risk among children, pregnant women, the elderly, and clinical patients. Because air pollution affects human health and increases morbidity and mortality, toxicological studies focusing on public exposure to industrial air pollution have increased. An automated environmental-toxicology-based air pollution monitoring system is therefore needed. To overcome the limitations of traditional monitoring systems and to reduce overall cost, this paper designs an IoT-enabled Environmental Toxicology for Air Pollution Monitoring using Artificial Intelligence technique (ETAPM-AIT) to improve human health. The proposed ETAPM-AIT model includes an IoT-based sensor array that senses eight parameters, namely NH3, CO, NO2, CH4, CO2, PM2.5, temperature, and humidity. The sensor array measures these levels and transmits them to a cloud server via gateways for analysis. The model reports air quality status in real time using the cloud server and raises an alarm when hazardous pollutant levels are present in the air. To classify air pollutants and determine air quality, an Artificial Algae Algorithm (AAA) based Elman Neural Network (ENN) model is used as a classifier, which predicts the air quality at forthcoming time stamps. The AAA is applied as a parameter tuning technique to determine the optimal parameter values of the ENN model.
To examine the air quality monitoring performance of the proposed ETAPM-AIT model, an extensive set of simulations was performed, and the results were inspected at durations of 5, 15, 30, and 60 min. The experimental outcome shows that the proposed ETAPM-AIT model performs better than recent techniques.
Affiliation(s)
- P Asha
- Department of Computer Science and Engineering, Sathyabama Institute of Science and Technology, India
- L Natrayan
- Department of Mechanical Engineering, Saveetha School of Engineering, SIMATS, India
- B T Geetha
- Department of ECE, Saveetha School of Engineering, SIMATS, Saveetha University, India
- J Rene Beulah
- Department of CSE, College of Engineering and Technology, Faculty of Engineering and Technology, SRM Institute of Science and Technology, SRM Nagar, Kattankulathur, India
- R Sumathy
- Department of Computer Science and Engineering, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, India
- G Varalakshmi
- Lecturer in Computer Science, Telangana Social Welfare Degree College for Womens, Siddipet, India
- S Neelakandan
- Department of CSE, R.M.K Engineering College, India.
25
|
Oh Y, Min S. Practical Application Using the Clustering Algorithm. ARTIF INTELL 2022. [DOI: 10.5772/intechopen.99314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
This chapter surveys clustering, an unsupervised learning approach among data mining and machine learning techniques. The most popular clustering algorithm is K-means, which represents each cluster of data by its center. Finding an appropriate value of K for partitioning the training dataset is essential to the K-means algorithm. This value is commonly found experimentally; the elbow method, a heuristic for determining the number of clusters, can also be used. One recent applied clustering study is a particulate matter concentration clustering algorithm for estimating particulate matter distribution. The algorithm uses K-means clustering to divide the region according to the fine dust distribution and then finds the coordinates of the optimal point according to the distribution of the particulate matter values. The training dataset consists of the latitude and longitude of each observatory and the PM10 value obtained from the AirKorea website provided by the Korea Environment Corporation. The study applied K-means to cluster the feature dataset and experimented with K values from 10 to 23 to find the one that represents the clusters best. It then generated 16 labels corresponding to 16 cities in Korea and compared them with the clustering result. Visualizing the clusters on an actual map showed whether the clusters for each city were evenly bound. Finally, the cluster centers were computed to find the observatory locations that represent the particulate matter distribution.
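The elbow heuristic mentioned in this abstract can be sketched as follows: run K-means for several values of K, record the within-cluster sum of squared distances (inertia), and look for the K after which the curve flattens. This is an illustrative sketch only. The point set is synthetic, the helper names are hypothetical, and the deterministic initialisation is an assumption made to keep the example reproducible; it is not the chapter's code or data.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans_inertia(points, k, iterations=20):
    """Run Lloyd's K-means on 2-D points and return the final inertia."""
    n = len(points)
    # Assumption: start from k points spread evenly through the list.
    # Practical code usually uses random or k-means++ initialisation.
    centers = [points[i * n // k] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[nearest].append(p)
        # Move each center to the mean of its members; keep it if the cluster is empty.
        centers = [
            (sum(x for x, _ in members) / len(members),
             sum(y for _, y in members) / len(members)) if members else centers[c]
            for c, members in enumerate(clusters)
        ]
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def elbow_curve(points, k_values):
    """Map each candidate K to its inertia so the elbow can be inspected."""
    return {k: kmeans_inertia(points, k) for k in k_values}


# Synthetic stand-in for station coordinates: three well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (20, 0), (20, 1), (21, 0), (21, 1)]
inertias = elbow_curve(points, range(1, 6))
```

For this toy data the inertia drops sharply up to K = 3 and changes little afterwards, which is the elbow the heuristic looks for.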
26
|
Fan K, Dhammapala R, Harrington K, Lamastro R, Lamb B, Lee Y. Development of a Machine Learning Approach for Local-Scale Ozone Forecasting: Application to Kennewick, WA. Front Big Data 2022; 5:781309. [PMID: 35237751 PMCID: PMC8883518 DOI: 10.3389/fdata.2022.781309] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/22/2021] [Accepted: 01/19/2022] [Indexed: 11/13/2022]
Abstract
Chemical transport models (CTMs) are widely used for air quality forecasts, but these models require large computational resources and often suffer from a systematic bias that leads to missed poor air pollution events. For example, a CTM-based operational forecasting system for air quality over the Pacific Northwest, called AIRPACT, uses over 100 processors for several hours to provide 48-h forecasts daily, but struggles to capture unhealthy O3 episodes during the summer and early fall, especially over Kennewick, WA. This research developed machine learning (ML) based O3 forecasts for Kennewick, WA to demonstrate an improved forecast capability. We used the 2017–2020 simulated meteorology and O3 observation data from Kennewick as training datasets. The meteorology datasets are from the Weather Research and Forecasting (WRF) meteorological model forecasts produced daily by the University of Washington. Our ozone forecasting system consists of two ML models, ML1 and ML2, to improve predictability: ML1 uses the random forest (RF) classifier and multiple linear regression (MLR) models, and ML2 uses a two-phase RF regression model with best-fit weighting factors. To avoid overfitting, we evaluate the ML forecasting system with the 10-time, 10-fold, and walk-forward cross-validation analysis. Compared to AIRPACT, ML1 improved forecast skill for high-O3 events and captured 5 out of 10 unhealthy O3 events, while AIRPACT and ML2 missed all the unhealthy events. ML2 showed better forecast skill for less elevated-O3 events. Based on this result, we set up our ML modeling framework to use ML1 for high-O3 events and ML2 for less elevated O3 events. Since May 2019, the ML modeling framework has been used to produce daily 72-h O3 forecasts and has provided forecasts via the web for clean air agency and public use: http://ozonematters.com/. Compared to the testing period, the operational forecasting period has not had unhealthy O3 events. 
Nevertheless, the ML modeling framework demonstrated a reliable forecasting capability at a selected location with much less computational resources. The ML system uses a single processor for minutes compared to the CTM-based forecasting system using more than 100 processors for hours.
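Walk-forward cross-validation, one of the evaluation schemes named in this abstract, always trains on the past and tests on the block that follows it, so no future information leaks into training. The sketch below shows only the index bookkeeping with an expanding training window. The function name and parameters are hypothetical, and the study's actual split sizes are not given here.

```python
def walk_forward_splits(n_samples, initial_train, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    The training window starts with `initial_train` samples and grows by
    `test_size` after every fold; each test block directly follows it.
    """
    start = initial_train
    while start + test_size <= n_samples:
        train = list(range(0, start))
        test = list(range(start, start + test_size))
        yield train, test
        start += test_size


# Example: 10 time-ordered samples, first 4 for initial training, test blocks of 2.
splits = list(walk_forward_splits(10, initial_train=4, test_size=2))
```

Unlike shuffled k-fold splitting, every test index here is later than every training index in the same fold, which matches how an operational forecast is actually used.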
Affiliation(s)
- Kai Fan
- Center for Advanced Systems Understanding, Görlitz, Germany
- Helmholtz-Zentrum Dresden Rossendorf, Dresden, Germany
- Laboratory for Atmospheric Research, Department of Civil and Environmental Engineering, Washington State University, Pullman, WA, United States
- Ranil Dhammapala
- Washington State Department of Ecology, Olympia, WA, United States
- Ryan Lamastro
- Environmental Geochemical Science, School of Science and Engineering, State University of New York, New Paltz, NY, United States
- Brian Lamb
- Laboratory for Atmospheric Research, Department of Civil and Environmental Engineering, Washington State University, Pullman, WA, United States
- Yunha Lee
- Center for Advanced Systems Understanding, Görlitz, Germany
- Helmholtz-Zentrum Dresden Rossendorf, Dresden, Germany
- Laboratory for Atmospheric Research, Department of Civil and Environmental Engineering, Washington State University, Pullman, WA, United States
- *Correspondence: Yunha Lee
27
|
Machine Learning to Predict Area Fugitive Emission Fluxes of GHGs from Open-Pit Mines. ATMOSPHERE 2022. [DOI: 10.3390/atmos13020210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Greenhouse gas (GHG) emissions from open-pit mines pose a global climate challenge, which necessitates appropriate quantification to support effective mitigation measures. This study considers the area-fugitive methane advective flux (as a proxy for emission flux) released from a tailings pond and two open-pit mines, denominated “old” and “new”, within a facility in northern Canada. To estimate the emission fluxes of methane from these sources, this research employed near-surface observations and modeling using the Weather Research and Forecasting (WRF) passive tracer dispersion method. Various machine learning (ML) methods were trained and tested on these data for the operational forecasting of emissions. Predicted emission fluxes and meteorological variables from the WRF model were used as training and input datasets for the ML algorithms. A series of 10 ML algorithms was evaluated, and the four models that generated the most accurate forecasts were selected: the multi-layer perceptron (MLP) artificial neural network, gradient boosting (GBR), XGBoost (XGB), and support vector machines (SVM). Overall, the simulations predicted the emission fluxes with R2 values higher than 0.8. In terms of bias (tonnes h−1), the ML predictions of the emission fluxes were within 6.3%, 3.3%, and 0.3% of the WRF predictions for the old mine, the new mine, and the pond, respectively.
28
|
Almazroi AA. Survival prediction among heart patients using machine learning techniques. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2022; 19:134-145. [PMID: 34902984 DOI: 10.3934/mbe.2022007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Cardiovascular diseases are regarded as the most common cause of death worldwide. According to the World Health Organization, nearly 17.9 million people die of heart-related diseases each year. The high share of cardiovascular diseases in total worldwide deaths has motivated researchers to look for ways to reduce these numbers. Several works have focused on developing machine learning techniques and algorithms for the early detection, diagnosis, and subsequent treatment of cardiovascular diseases. These works address a variety of issues, from finding the features that most effectively predict the occurrence of heart-related diseases to calculating survival probability. This research contributes to the body of literature by selecting a standard, well-defined, and well-curated dataset, as well as a set of standard benchmark algorithms, and independently verifying their performance with a set of different evaluation metrics. The experimental evaluation showed that the decision tree is the best-performing algorithm in comparison with logistic regression, support vector machines, and artificial neural networks. Decision trees achieved 14% better accuracy than the average performance of the remaining techniques. In contrast to other studies, this research observed that artificial neural networks are not as competitive as the decision tree or the support vector machine.
Affiliation(s)
- Abdulwahab Ali Almazroi
- University of Jeddah, College of Computing and Information Technology at Khulais, Department of Information Technology, Jeddah, Saudi Arabia
29
|
Zaini N, Ean LW, Ahmed AN, Malek MA. A systematic literature review of deep learning neural network for time series air quality forecasting. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2022; 29:4958-4990. [PMID: 34807385 DOI: 10.1007/s11356-021-17442-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/19/2021] [Accepted: 11/05/2021] [Indexed: 06/13/2023]
Abstract
Rapid industrial development, urbanization, and traffic growth have reduced air quality, which negatively affects human health and environmental sustainability, especially among developed countries. Numerous studies have developed air quality forecasting models using machine learning to help control air pollution, and a significant number of reviews cover the application of machine learning to air quality forecasting. Shallow machine learning architectures exhibit several limitations and yield lower forecasting accuracy than deep learning architectures. Deep learning is a new technology in computational intelligence; thus, its application in air quality forecasting is still limited. This study investigates deep learning applications in time series air quality forecasting. A thorough literature search across all scientific databases was therefore conducted to avoid unnecessary clutter. This study summarizes and discusses the different types of deep learning algorithms applied in air quality forecasting, including their theoretical backgrounds, hyperparameters, applications, and limitations. Hybrid deep learning models that incorporate data decomposition, optimization algorithms, and spatiotemporal modeling are also presented to highlight how effectively those techniques address the drawbacks of individual deep learning models. The reviewed literature clearly states that hybrid deep learning forecasts future air quality with higher accuracy than individual models. At the end of the study, some possible research directions are suggested for future model development. The main objective of this review is to provide a comprehensive literature summary of deep learning applications in time series air quality forecasting that may benefit interested researchers in subsequent research.
Affiliation(s)
- Nur'atiah Zaini
- Institute of Sustainable Energy, Universiti Tenaga Nasional, Selangor, Malaysia.
- Lee Woen Ean
- Institute of Sustainable Energy, Universiti Tenaga Nasional, Selangor, Malaysia
- Ali Najah Ahmed
- Institute of Energy Infrastructure, Universiti Tenaga Nasional, Selangor, Malaysia
30
|
Bekkar A, Hssina B, Douzi S, Douzi K. Air-pollution prediction in smart city, deep learning approach. JOURNAL OF BIG DATA 2021; 8:161. [PMID: 34956819 PMCID: PMC8693596 DOI: 10.1186/s40537-021-00548-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 08/21/2021] [Accepted: 12/10/2021] [Indexed: 06/14/2023]
Abstract
Over the past few decades, due to human activities, industrialization, and urbanization, air pollution has become a life-threatening factor in many countries around the world. Among air pollutants, particulate matter with a diameter of less than 2.5 μm (PM2.5) is a serious health problem. It causes various illnesses such as respiratory tract and cardiovascular diseases. Hence, it is necessary to predict PM2.5 concentrations accurately in order to protect citizens from the dangerous impact of air pollution in advance. The variation of PM2.5 depends on a variety of factors, such as meteorology and the concentration of other pollutants in urban areas. In this paper, we implemented a deep learning solution based on CNN-LSTM to forecast the hourly PM2.5 concentration in Beijing, China, with spatial-temporal features obtained by combining historical pollutant data, meteorological data, and PM2.5 concentrations at adjacent stations. We examined the difference in performance among deep learning algorithms such as LSTM, Bi-LSTM, GRU, Bi-GRU, CNN, and a hybrid CNN-LSTM model. Experimental results indicate that our method, "hybrid CNN-LSTM multivariate", produces more accurate predictions than all the other listed models.
Affiliation(s)
- Badr Hssina
- FSTM, University Hassan II, Casablanca, Morocco
31
|
Yu Z, Jang M, Madhu A. Prediction of Phase State of Secondary Organic Aerosol Internally Mixed with Aqueous Inorganic Salts. J Phys Chem A 2021; 125:10198-10206. [PMID: 34797662 DOI: 10.1021/acs.jpca.1c06773] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
Abstract
In the presence of inorganic salts, secondary organic aerosol (SOA) in ambient air undergoes liquid-liquid phase separation (LLPS) or liquid-solid phase separation, or remains in a homogeneous phase. In this study, a regression model was derived to predict the aerosol phase separation relative humidity (SRH) for various organic and inorganic mixes. The model used organic physicochemical parameters (i.e., oxygen to carbon ratio, molecular weight, and hydrogen-bonding ability) and parameters related to inorganic composition (i.e., ammonium, sulfate, nitrate, and water). The aerosol phase data were observed using an optical microscope and also collected from the literature. The crystallization of aerosols at the efflorescence RH (ERH) was semiempirically predicted with a neural network model. Overall, SRH was greater for organic compounds with lower oxygen to carbon ratios or greater molecular weight, whereas higher aerosol acidity or a larger fraction of inorganic nitrate led to lower SRH. The resulting model was demonstrated for three different chamber-generated SOA (originating from β-pinene, toluene, and 1,3,5-trimethylbenzene), which were internally mixed with the inorganic aqueous system of ammonium-sulfate-water. For all three SOA systems, both observations and model predictions showed LLPS at RH <80%. In the urban atmosphere, LLPS is likely a frequent occurrence for typical anthropogenic SOA, which originates from aromatic and alkane hydrocarbons.
Affiliation(s)
- Zechen Yu
- Department of Environmental Engineering Sciences, University of Florida, Gainesville, Florida 32611, United States
- Myoseon Jang
- Department of Environmental Engineering Sciences, University of Florida, Gainesville, Florida 32611, United States
- Azad Madhu
- Department of Environmental Engineering Sciences, University of Florida, Gainesville, Florida 32611, United States
32
|
Espinosa R, Palma J, Jiménez F, Kamińska J, Sciavicco G, Lucena-Sánchez E. A time series forecasting based multi-criteria methodology for air quality prediction. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107850] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/20/2022]
33
|
Popa CL, Dobrescu TG, Silvestru CI, Firulescu AC, Popescu CA, Cotet CE. Pollution and Weather Reports: Using Machine Learning for Combating Pollution in Big Cities. SENSORS 2021; 21:s21217329. [PMID: 34770634 PMCID: PMC8586941 DOI: 10.3390/s21217329] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/30/2021] [Revised: 10/31/2021] [Accepted: 11/01/2021] [Indexed: 02/01/2023]
Abstract
Air pollution has become the most important issue concerning human evolution in the last century, as the levels of toxic gases and particles present in the air create health problems and affect the ecosystems of the planet. Scientists and environmental organizations have been looking for new ways to combat and control air pollution, developing new solutions as technologies evolve. In the last decade, devices able to observe and maintain pollution levels have become more accessible and less expensive, and with the appearance of the Internet of Things (IoT), new approaches for combating pollution have emerged. The research presented in this paper focused on predicting the behaviour of the air quality index using machine learning. Data were collected from one of the six atmospheric stations set in relevant areas of Bucharest, Romania, to validate the model. Several algorithms were proposed to study the evolution of temperature depending on the level of pollution and on several pollution factors. Finally, the results generated by the algorithms are presented by pollutant type for two distinct periods. Prediction errors are reported as the root mean square error (RMSE) for each of the three machine learning algorithms used.
34
|
Colorado Cifuentes GU, Flores Tlacuahuac A. A short-term deep learning model for urban pollution forecasting with incomplete data. CAN J CHEM ENG 2021. [DOI: 10.1002/cjce.23957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
35
|
A New Multi-Scale Sliding Window LSTM Framework (MSSW-LSTM): A Case Study for GNSS Time-Series Prediction. REMOTE SENSING 2021. [DOI: 10.3390/rs13163328] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 01/31/2023]
Abstract
GNSS time-series prediction plays an important role in the monitoring of crustal plate movement, and dam or bridge deformation, and the maintenance of global or regional coordinate frames. Deep learning is a state-of-the-art approach for extracting high-level abstract features from big data without any prior knowledge. Moreover, long short-term memory (LSTM) networks are a form of recurrent neural networks that have significant potential for processing time series. In this study, a novel prediction framework was proposed by combining a multi-scale sliding window (MSSW) with LSTM. Specifically, MSSW was applied for data preprocessing to effectively extract the feature relationship at different scales and simultaneously mine the deep characteristics of the dataset. Then, multiple LSTM neural networks were used to predict and obtain the final result by weighting. To verify the performance of MSSW-LSTM, 1000 daily solutions of the XJSS station in the Up component were selected for prediction experiments. Compared with the traditional LSTM method, our results of three groups of controlled experiments showed that the RMSE value was reduced by 2.1%, 23.7%, and 20.1%, and MAE was decreased by 1.6%, 21.1%, and 22.2%, respectively. Our results showed that the MSSW-LSTM algorithm can achieve higher prediction accuracy and smaller error, and can be applied to GNSS time-series prediction.
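The preprocessing idea in this abstract, cutting the same series into input windows of several lengths so that one model per scale can be trained and the outputs weighted, can be sketched as below. This is an assumption-laden illustration: `multi_scale_windows` and `weighted_forecast` are hypothetical names, the scales and weights are placeholders, and the per-scale LSTM models of the paper are not reproduced.

```python
def multi_scale_windows(series, scales, horizon=1):
    """Build (inputs, target) samples where inputs holds one window per scale.

    For each time index t, the window of scale s is series[t - s:t]; all
    windows end at the same point and share the same target value.
    """
    longest = max(scales)
    samples = []
    for t in range(longest, len(series) - horizon + 1):
        inputs = {s: series[t - s:t] for s in scales}
        target = series[t + horizon - 1]
        samples.append((inputs, target))
    return samples


def weighted_forecast(predictions, weights):
    """Combine per-scale predictions with normalised weights."""
    total = sum(weights[s] for s in predictions)
    return sum(predictions[s] * weights[s] for s in predictions) / total


# Toy series 0..9 with a short (2) and a long (4) window.
samples = multi_scale_windows(list(range(10)), scales=(2, 4))
combined = weighted_forecast({2: 10.0, 4: 12.0}, weights={2: 1.0, 4: 3.0})
```

Each scale's windows would feed its own model; the weighting step then merges the per-scale outputs into one forecast.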
36
|
Matasović B, Pehnec G, Bešlić I, Davila S, Babić D. Assessment of ozone concentration data from the northern Zagreb area, Croatia, for the period from 2003 to 2016. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2021; 28:36640-36650. [PMID: 33704644 DOI: 10.1007/s11356-021-13295-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2020] [Accepted: 03/01/2021] [Indexed: 06/12/2023]
Abstract
A measurement station located in an urban area on the southern slope of Medvednica Mountain (120 m a.s.l.), close to the Croatian capital Zagreb, provided data for an analysis of photosmog in the city of Zagreb. The data for the period 2003-2016 obtained from this station and analysed in this work can also be compared with data from the nearby Puntijarka station (980 m a.s.l.), for which a similar analysis has already been carried out. The Puntijarka analysis showed that there was most probably no significant change in ozone concentrations during the observed period. In this study, the mean annual ozone volume fractions showed a linear trend of 0.23 ppb yr−1, a growth that is, in the worst-case scenario, among the lowest global predictions, while the seasonal (April-to-September) mean values had a trend of 0.32 ppb yr−1, which is a clearly observable growth. The 95th-percentile values had trends of 0.009 ppb yr−1 (annual data) and −0.072 ppb yr−1 (seasonal data), respectively; both values show very small changes, if any at all. Fourier transform (FT) analysis of the data for periodic behaviour, with calculation of uncertainties, revealed three prominent cycles of 169 ± 4 h (weekly modulation), 24 ± 1 h, and 12 ± 1 h (diurnal modulations). The uncertainties were low, which strongly indicates that the cycles are present. However, since high concentrations of ozone were observed only sporadically, ozone pollution in the northern part of Zagreb is at present rather low. Nevertheless, constant monitoring is important and will continue as part of continuous monitoring of ozone levels in the area.
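The way a Fourier transform exposes diurnal cycles in an hourly record can be illustrated with a plain discrete Fourier transform. The sketch below uses a synthetic hourly series, not the Zagreb measurements, and the function name `top_periods` is hypothetical. It ranks periods by spectral power and does not estimate the uncertainties reported in the abstract.

```python
import cmath
import math


def top_periods(samples, count, dt_hours=1.0):
    """Return the `count` periods (in hours) with the highest DFT power."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]  # remove the constant offset

    powers = []
    for k in range(1, n // 2 + 1):  # skip k = 0, stop at the Nyquist bin
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        powers.append((abs(coeff) ** 2, k))

    powers.sort(reverse=True)
    return [n * dt_hours / k for _, k in powers[:count]]


# Synthetic 10-day hourly series: strong 24 h cycle plus a weaker 12 h cycle.
hours = range(240)
series = [40 + 10 * math.sin(2 * math.pi * t / 24)
          + 3 * math.sin(2 * math.pi * t / 12) for t in hours]
periods = top_periods(series, count=2)
```

The direct sum costs O(n²) operations, which is fine for a short demonstration; a multi-year hourly record would normally be processed with a fast Fourier transform routine instead.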
Affiliation(s)
- Brunislav Matasović
- Josip Juraj Strossmayer University of Osijek, Ulica cara Hadrijana 8a, HR-31000, Osijek, Croatia.
- Gordana Pehnec
- Institute for Medical Research and Occupational Health, Ksaverska cesta 2, HR-10000, Zagreb, Croatia
- Ivan Bešlić
- Institute for Medical Research and Occupational Health, Ksaverska cesta 2, HR-10000, Zagreb, Croatia
- Silvije Davila
- Institute for Medical Research and Occupational Health, Ksaverska cesta 2, HR-10000, Zagreb, Croatia
- Dinko Babić
- Institute for Medical Research and Occupational Health, Ksaverska cesta 2, HR-10000, Zagreb, Croatia
37
|
Qiao W, Wang Y, Zhang J, Tian W, Tian Y, Yang Q. An innovative coupled model in view of wavelet transform for predicting short-term PM10 concentration. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2021; 289:112438. [PMID: 33872873 DOI: 10.1016/j.jenvman.2021.112438] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 08/05/2019] [Revised: 02/19/2021] [Accepted: 03/19/2021] [Indexed: 06/12/2023]
Abstract
Wavelet transform (WT) is an advanced preprocessing technique that has been widely used in PM10 prediction. However, this technique cannot provide stable performance because the number of wavelet layers is selected empirically. To fix the optimal number of wavelet layers in PM10 forecasting, an innovative coupled model based on WT, long short-term memory (LSTM), and a stacked autoencoder (SAE) is proposed. This study designs a crossover experiment with 960 high- and low-frequency components obtained by wavelet decomposition and predicts each component with SAE-LSTM, based on 12 samples from different regions. The results indicate that the developed model outperforms BiLSTM (bidirectional LSTM) and LSTM on some error evaluation indicators (i.e., the Nash-Sutcliffe efficiency coefficient (NSEC)) and that, compared with other steps, two-step prediction has the highest accuracy in terms of root mean square error (RMSE). In addition, for the 12 samples, decomposition with high layers gives higher prediction accuracy than decomposition with low layers. This paper fixes the optimal wavelet layers in PM10 prediction, which provides a meaningful reference for other prediction scenarios that apply WT.
Affiliation(s)
- Weibiao Qiao
- School of Vehicle and Energy, Yan Shan University, Qinhuangdao, 066004, China; School of Environmental and Municipal Engineering, North China University of Water Resources and Electric Power, Zhengzhou, 450046, China
- Yining Wang
- School of Environmental and Municipal Engineering, North China University of Water Resources and Electric Power, Zhengzhou, 450046, China
- Jianzhuang Zhang
- School of Environmental and Municipal Engineering, North China University of Water Resources and Electric Power, Zhengzhou, 450046, China
- Wencai Tian
- School of Environmental and Municipal Engineering, North China University of Water Resources and Electric Power, Zhengzhou, 450046, China
- Yu Tian
- School of Environmental and Municipal Engineering, North China University of Water Resources and Electric Power, Zhengzhou, 450046, China
- Quan Yang
- Faculty of Architecture, Civil and Transportation Engineering, Beijing University of Technology, Beijing, 100124, China.
38
|
Modeling and Analysis of Data-Driven Systems through Computational Neuroscience Wavelet-Deep Optimized Model for Nonlinear Multicomponent Data Forecasting. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2021; 2021:8810046. [PMID: 34234823 PMCID: PMC8216800 DOI: 10.1155/2021/8810046] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/03/2020] [Accepted: 05/26/2021] [Indexed: 11/18/2022]
Abstract
Complex time series data exist widely in real systems, and forecasting them has great practical significance. However, classical linear models cannot achieve satisfactory performance because of the nonlinearity and multicomponent characteristics of such data. Based on a data-driven mechanism, this paper proposes a deep learning method coupled with Bayesian optimization and based on wavelet decomposition to model time series data and forecast its trend. First, the data are decomposed by wavelet transform to reduce the complexity of the time series. A Gated Recurrent Unit (GRU) network is then trained as a submodel for each decomposition component. The hyperparameters of the wavelet decomposition and of each submodel are optimized with Bayesian sequential model-based optimization (SMBO) to improve modeling accuracy. Finally, the results of all submodels are added to obtain the forecast. PM2.5 data collected by the US Air Quality Monitoring Station are used for the experiments. Comparison with other networks shows that the proposed method performs better in the multistep forecasting task for complex time series.
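The decompose, forecast per component, then add pattern described in this abstract can be sketched with a single-level Haar decomposition. This is a simplified stand-in under stated assumptions: the paper's wavelet family, decomposition depth, GRU submodels, and Bayesian tuning are not reproduced, and the function names and the trivial submodels below are hypothetical placeholders.

```python
def haar_components(series):
    """Split an even-length series into level-1 Haar approximation and detail.

    Both components have the same length as the input and add up to it
    exactly, which is what allows per-component forecasts to be summed.
    """
    if len(series) % 2:
        raise ValueError("series length must be even")
    approx, detail = [], []
    for i in range(0, len(series), 2):
        a, b = series[i], series[i + 1]
        mean, half_diff = (a + b) / 2, (a - b) / 2
        approx += [mean, mean]            # smooth, low-frequency part
        detail += [half_diff, -half_diff]  # fluctuation around the pair mean
    return approx, detail


def forecast_by_components(series, submodels):
    """Forecast each component with its own submodel and add the results."""
    components = haar_components(series)
    return sum(model(part) for model, part in zip(submodels, components))


# Placeholder submodels (assumptions): persistence for the smooth part,
# zero for the detail part. A real system would train one model per component.
series = [4.0, 2.0, 6.0, 8.0]
approx, detail = haar_components(series)
forecast = forecast_by_components(series, [lambda part: part[-1], lambda part: 0.0])
```

Because the components sum back to the original series, any error in the final forecast comes from the submodels rather than from the decomposition itself.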
|
39
|
Nguyen H, Tran K, Thomassey S, Hamad M. Forecasting and Anomaly Detection approaches using LSTM and LSTM Autoencoder techniques with the applications in supply chain management. INTERNATIONAL JOURNAL OF INFORMATION MANAGEMENT 2021. [DOI: 10.1016/j.ijinfomgt.2020.102282] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Indexed: 10/22/2022]
|
40
|
Liu H, Yan G, Duan Z, Chen C. Intelligent modeling strategies for forecasting air quality time series: A review. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2020.106957] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 10/22/2022]
|
41
|
High granular and short term time series forecasting of PM2.5 air pollutant - a comparative review. Artif Intell Rev 2021. [DOI: 10.1007/s10462-021-09991-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 10/21/2022]
|
42
|
Shi G, Leung Y, Zhang JS, Fung T, Du F, Zhou Y. A novel method for identifying hotspots and forecasting air quality through an adaptive utilization of spatio-temporal information of multiple factors. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021; 759:143513. [PMID: 33246725 DOI: 10.1016/j.scitotenv.2020.143513] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/17/2020] [Revised: 10/22/2020] [Accepted: 10/28/2020] [Indexed: 06/12/2023]
Abstract
Air pollution exerts serious impacts on human health and sustainable development. The accurate forecasting of air quality can guide the formulation of mitigation strategies and reduce exposure to air pollution. It is beneficial to explicitly consider both spatial and temporal information of multiple factors, e.g., the meteorological data, in the forecasting of air pollutant concentrations. The temporal information of relevant factors collected at a location should be considered for forecasting. In addition, these factors recorded at other locations may also provide useful information. Existing methods utilizing the spatio-temporal information of these relevant factors are usually based on some very complicated frameworks. In this study, we propose a novel and simple spatial attention-based long short-term memory (SA-LSTM) that combines LSTM and a spatial attention mechanism to adaptively utilize the spatio-temporal information of multiple factors for forecasting air pollutant concentrations. Specifically, the SA-LSTM employs gated recurrent connections to extract temporal information of multiple factors at individual locations, and the spatial attention mechanism to spatially fuse the temporal information extracted at these locations. This method is effective and applicable to forecast any air pollutant concentrations when spatio-temporal information of relevant factors has to be utilized. To validate the effectiveness of the proposed SA-LSTM, we apply it to forecast the daily air quality in Hong Kong, a high density city with peculiar cityscapes, by using the air quality and meteorological data. Empirical results demonstrate that the proposed SA-LSTM outperforms the conventional models with respect to one-day forecast accuracy, especially for extreme values. 
Moreover, the attention weights learned by the SA-LSTM can identify hotspots of the air pollution process for reducing computational complexity of forecasting and provide a better understanding of the underlying mechanism of air pollution.
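The spatial fusion step described in this abstract, where per-station temporal summaries are combined through attention weights, can be sketched as follows. This is a hedged illustration rather than the SA-LSTM itself: the dot-product scoring rule, the `query` vector, and the function names are assumptions, and the station state vectors are made-up numbers standing in for what an LSTM would produce at each location.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_attention_fuse(station_states, query):
    """Fuse per-station state vectors into one vector.

    Assumed scoring: dot product between each station state and `query`.
    Returns the fused vector and the attention weight of each station.
    """
    scores = [sum(h * q for h, q in zip(state, query)) for state in station_states]
    weights = softmax(scores)
    fused = [
        sum(w * state[d] for w, state in zip(weights, station_states))
        for d in range(len(query))
    ]
    return fused, weights


if __name__ == "__main__":
    # Hypothetical temporal summaries for three monitoring stations.
    states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    fused, weights = spatial_attention_fuse(states, query=[1.0, 1.0])
    hotspot = max(range(len(weights)), key=weights.__getitem__)
    print(fused, weights, hotspot)
```

In this sketch the station with the largest weight plays the role of a "hotspot" in the sense used by the abstract: the location whose information contributes most to the fused representation.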
Affiliation(s)
- Guang Shi
- School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China; Institute of Future Cities, The Chinese University of Hong Kong, Shatin, Hong Kong, China
| | - Yee Leung
- Institute of Future Cities, The Chinese University of Hong Kong, Shatin, Hong Kong, China; Department of Geography and Resource Management, The Chinese University of Hong Kong, Shatin, Hong Kong, China
| | - Jiang She Zhang
- School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Tung Fung
- Institute of Future Cities, The Chinese University of Hong Kong, Shatin, Hong Kong, China; Department of Geography and Resource Management, The Chinese University of Hong Kong, Shatin, Hong Kong, China
| | - Fang Du
- Department of Mathematics and Information Science, Faculty of Science, Chang'an University, Xi'an, ShaanXi 710064, China
| | - Yu Zhou
- Institute of Future Cities, The Chinese University of Hong Kong, Shatin, Hong Kong, China; Department of Geography and Resource Management, The Chinese University of Hong Kong, Shatin, Hong Kong, China.
|
43
|
A Review of Recent Machine Learning Advances for Forecasting Harmful Algal Blooms and Shellfish Contamination. JOURNAL OF MARINE SCIENCE AND ENGINEERING 2021. [DOI: 10.3390/jmse9030283] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Indexed: 12/25/2022]
Abstract
Harmful algal blooms (HABs) are among the most severe ecological marine problems worldwide. Under favorable climate and oceanographic conditions, toxin-producing microalgae species may proliferate, reach increasingly high cell concentrations in seawater, accumulate in shellfish, and threaten the health of seafood consumers. There is an urgent need for the development of effective tools to help shellfish farmers to cope and anticipate HAB events and shellfish contamination, which frequently leads to significant negative economic impacts. Statistical and machine learning forecasting tools have been developed in an attempt to better inform the shellfish industry to limit damages, improve mitigation measures and reduce production losses. This study presents a synoptic review covering the trends in machine learning methods for predicting HABs and shellfish biotoxin contamination, with a particular focus on autoregressive models, support vector machines, random forest, probabilistic graphical models, and artificial neural networks (ANN). Most efforts have been attempted to forecast HABs based on models of increased complexity over the years, coupled with increased multi-source data availability, with ANN architectures in the forefront to model these events. The purpose of this review is to help defining machine learning-based strategies to support shellfish industry to manage their harvesting/production, and decision making by governmental agencies with environmental responsibilities.
|
44
|
Artificial Neural Networks, Sequence-to-Sequence LSTMs, and Exogenous Variables as Analytical Tools for NO2 (Air Pollution) Forecasting: A Case Study in the Bay of Algeciras (Spain). SENSORS 2021; 21:s21051770. [PMID: 33806409 PMCID: PMC7961900 DOI: 10.3390/s21051770] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/11/2021] [Revised: 02/22/2021] [Accepted: 02/28/2021] [Indexed: 11/17/2022]
Abstract
This study aims to produce accurate predictions of the NO2 concentrations at a specific station of a monitoring network located in the Bay of Algeciras (Spain). Artificial neural networks (ANNs) and sequence-to-sequence long short-term memory networks (LSTMs) were used to create the forecasting models. Additionally, a new prediction method was proposed combining LSTMs using a rolling window scheme with a cross-validation procedure for time series (LSTM-CVT). Two different strategies were followed regarding the input variables: using NO2 from the station or employing NO2 and other pollutants data from any station of the network plus meteorological variables. The ANN and LSTM-CVT exogenous models used lagged datasets of different window sizes. Several feature ranking methods were used to select the top lagged variables and include them in the final exogenous datasets. Prediction horizons of t + 1, t + 4 and t + 8 were employed. The exogenous variables inclusion enhanced the model's performance, especially for t + 4 (ρ ≈ 0.68 to ρ ≈ 0.74) and t + 8 (ρ ≈ 0.59 to ρ ≈ 0.66). The proposed LSTM-CVT method delivered promising results as the best performing models per prediction horizon employed this new methodology. Additionally, per each parameter combination, it obtained lower error values than ANNs in 85% of the cases.
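The lagged datasets and the rolling window scheme with time-series cross-validation mentioned in this abstract can be sketched in plain Python. This is an illustration under assumptions, not the LSTM-CVT procedure: the function names, window sizes, and split sizes are hypothetical, and no model is trained here.

```python
def make_lagged(series, window, horizon):
    """Pair each window of past values with the value `horizon` steps ahead.

    If the window ends at time t, the target is the value at t + horizon.
    """
    samples = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        features = series[start:start + window]
        target = series[start + window + horizon - 1]
        samples.append((features, target))
    return samples


def rolling_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices) pairs.

    Each test block immediately follows its training block in time, and
    the whole window rolls forward by `test_size` samples per fold.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += test_size


if __name__ == "__main__":
    no2 = list(range(10))  # placeholder values, not real measurements
    samples = make_lagged(no2, window=3, horizon=1)
    for train_idx, test_idx in rolling_splits(len(samples), train_size=3, test_size=2):
        print(train_idx, test_idx)
```

Keeping every test block strictly after its training block is what distinguishes this kind of split from shuffled k-fold cross-validation, which would let a forecasting model see the future during training.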
|
45
|
Tu T, Xu K, Xu L, Gao Y, Zhou Y, He Y, Liu Y, Liu Q, Ji H, Tang W. Association between meteorological factors and the prevalence dynamics of Japanese encephalitis. PLoS One 2021; 16:e0247980. [PMID: 33657174 PMCID: PMC7928514 DOI: 10.1371/journal.pone.0247980] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/09/2020] [Accepted: 02/17/2021] [Indexed: 12/29/2022] Open
Abstract
Japanese encephalitis (JE) is an acute infectious disease caused by the Japanese encephalitis virus (JEV) and is transmitted by mosquitoes. Meteorological conditions are known to play a pivotal role in the spread of JEV. In this study, a zero-inflated generalised additive model and a long short-term memory model were used to assess the relationship between the meteorological factors and population density of Culex tritaeniorhynchus as well as the incidence of JE and to predict the prevalence dynamics of JE, respectively. The incidence of JE in the previous month, the mean air temperature and the average relative humidity had positive effects on the outbreak risk and intensity. Meanwhile, the density of all mosquito species in livestock sheds (DMSL) only affected the outbreak risk. Moreover, a region-specific prediction model of JE was developed in Chongqing using a long short-term memory neural network. Our study contributes to a better understanding of the JE dynamics and helps the local government establish precise prevention and control measures.
Affiliation(s)
- Taotian Tu
- Chongqing Municipal Center for Disease Control and Prevention, Chongqing, China
| | - Keqiang Xu
- College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan Province, China
| | - Lei Xu
- State Key Laboratory of Infectious Disease Prevention and Control, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China
- Department of Earth System Science, Ministry of Education Key Laboratory for Earth System Modeling, Tsinghua University, Beijing, China
| | - Yuan Gao
- State Key Laboratory of Infectious Disease Prevention and Control, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China
| | - Ying Zhou
- Chongqing Municipal Center for Disease Control and Prevention, Chongqing, China
| | - Yaming He
- Chongqing Municipal Center for Disease Control and Prevention, Chongqing, China
| | - Yang Liu
- Chongqing Municipal Center for Disease Control and Prevention, Chongqing, China
| | - Qiyong Liu
- State Key Laboratory of Infectious Disease Prevention and Control, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China
| | - Hengqing Ji
- Chongqing Municipal Center for Disease Control and Prevention, Chongqing, China
- * E-mail: (WT); (HJ)
| | - Wenge Tang
- Chongqing Municipal Center for Disease Control and Prevention, Chongqing, China
- * E-mail: (WT); (HJ)
|
46
|
Abstract
Air pollution and its consequences are negatively affecting the world population and the environment, which makes air quality monitoring and forecasting techniques essential tools to combat this problem. To predict air quality with maximum accuracy, it is crucial to consider the dataset types as well as the implemented models and the quantity of data. This study selected a set of research works in the field of air quality prediction and concentrates on exploring the datasets utilised in them. The most significant findings of this research work are: (1) meteorological datasets were used in 94.6% of the papers, far more than any other dataset type, and were complemented with others such as temporal data, spatial data, and so on; (2) combinations of various datasets have been used since 2009; and (3) open data have been used since 2012, with 32.3% of the studies using open data and 63.4% of the studies not providing their data.
|
47
|
Oh ST, Ga DH, Lim JH. Mobile Deep Learning System That Calculates UVI Using Illuminance Value of User's Location. SENSORS 2021; 21:s21041227. [PMID: 33572393 PMCID: PMC7916185 DOI: 10.3390/s21041227] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/30/2020] [Revised: 01/31/2021] [Accepted: 02/05/2021] [Indexed: 12/23/2022]
Abstract
Ultraviolet rays are closely related with human health and, recently, optimum exposure to the UV rays has been recommended, with growing importance being placed on correct UV information. However, many countries provide UV information services at a local level, which makes it impossible for individuals to acquire user-based, accurate UV information unless individuals operate UV measurement devices with expertise on the relevant field for interpretation of the measurement results. There is a limit in measuring ultraviolet rays’ information by the users at their respective locations. Research about how to utilize mobile devices such as smartphones to overcome such limitation is also lacking. This paper proposes a mobile deep learning system that calculates UVI based on the illuminance values at the user’s location obtained with mobile devices’ help. The proposed method analyzed the correlation between illuminance and UVI based on the natural light DB collected through the actual measurements, and the deep learning model’s data set was extracted. After the selection of the input variables to calculate the correct UVI, the deep learning model based on the TensorFlow set with the optimum number of layers and number of nodes was designed and implemented, and learning was executed via the data set. After the data set was converted to the mobile deep learning model to operate under the mobile environment, the converted data were loaded on the mobile device. The proposed method enabled providing UV information at the user’s location through a mobile device on which the illuminance sensors were loaded even in the environment without UVI measuring equipment. The comparison of the experiment results with the reference device (spectrometer) proved that the proposed method could provide UV information with an accuracy of 90–95% in the summers, as well as in winters.
Affiliation(s)
- Seung-Taek Oh
- Smart Natural Space Research Center, Kongju National University, Cheonan 31080, Korea;
| | - Deog-Hyeon Ga
- Department of Computer Science & Engineering, Kongju National University, Cheonan 31080, Korea;
| | - Jae-Hyun Lim
- Department of Computer Science & Engineering, Kongju National University, Cheonan 31080, Korea;
- Department of Urban Systems Engineering, Kongju National University, Cheonan 31080, Korea
- Correspondence: ; Tel.: +82-10-8864-6195
|
48
|
Torres JF, Hadjout D, Sebaa A, Martínez-Álvarez F, Troncoso A. Deep Learning for Time Series Forecasting: A Survey. BIG DATA 2021; 9:3-21. [PMID: 33275484 DOI: 10.1089/big.2020.0159] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Indexed: 06/12/2023]
Abstract
Time series forecasting has become a very intensive field of research, which is even increasing in recent years. Deep neural networks have proved to be powerful and are achieving high accuracy in many application fields. For these reasons, they are one of the most widely used methods of machine learning to solve problems dealing with big data nowadays. In this work, the time series forecasting problem is initially formulated along with its mathematical fundamentals. Then, the most common deep learning architectures that are currently being successfully applied to predict time series are described, highlighting their advantages and limitations. Particular attention is given to feed forward networks, recurrent neural networks (including Elman, long-short term memory, gated recurrent units, and bidirectional networks), and convolutional neural networks. Practical aspects, such as the setting of values for hyper-parameters and the choice of the most suitable frameworks, for the successful application of deep learning to time series are also provided and discussed. Several fruitful research fields in which the architectures analyzed have obtained a good performance are reviewed. As a result, research gaps have been identified in the literature for several domains of application, thus expecting to inspire new and better forms of knowledge.
Affiliation(s)
- José F Torres
- Data Science and Big Data Lab, Pablo de Olavide University, Seville, Spain
| | - Dalil Hadjout
- Department of Commerce, SADEG Company (Sonelgaz Group), Bejaia, Algeria
| | - Abderrazak Sebaa
- LIMED Laboratory, Faculty of Exact Sciences, University of Bejaia, Bejaia, Algeria
- Higher School of Sciences and Technologies of Computing and Digital, Bejaia, Algeria
| | | | - Alicia Troncoso
- Data Science and Big Data Lab, Pablo de Olavide University, Seville, Spain
|
49
|
Shatnawi N, Abu-Qdais H. Assessing and predicting air quality in northern Jordan during the lockdown due to the COVID-19 virus pandemic using artificial neural network. AIR QUALITY, ATMOSPHERE, & HEALTH 2021; 14:643-652. [PMID: 33520010 PMCID: PMC7831622 DOI: 10.1007/s11869-020-00968-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/14/2020] [Accepted: 11/25/2020] [Indexed: 05/16/2023]
Abstract
This study deals with the simulation and prediction of air pollutants in Irbid city (north of Jordan) before and during the spread of the COVID-19 virus pandemic by using an artificial neural network (ANN). Based on the data obtained from the air quality monitoring station for the year 2019 and the first quarter of the year 2020, it was possible to develop an ANN model to simulate and predict the concentrations of three air pollutants, namely nitrogen dioxide (NO2), sulfur dioxide (SO2), and particulate matter with diameter less than 10 μm (PM10). Several ANN model configurations were tested to select the best model that could predict the concentration of the three air pollutants with meteorological parameters being used as input to the model. The results showed that the concentration of the pollutants during the coronavirus lockdown declined by various percentages (from 29% for PM10 to 72% for NO2) as compared to their concentration before the pandemic period. Furthermore, the developed ANN model could simulate and predict the concentration of the pollutants during the pandemic period with sufficient accuracy as judged by the values of the coefficient of determination and the mean square error. The study results indicate that a properly trained and structured ANN can be a useful tool to predict air quality parameters with adequate accuracy.
Affiliation(s)
- Nawras Shatnawi
- Surveying and Geomatics Engineering Department, Al-Balqa Applied University, Al-Salt, 19117 Jordan
| | - Hani Abu-Qdais
- Civil Engineering Department, Jordan University of Science and Technology, P.O. Box 3030, Irbid, 22110 Jordan
|
50
|
PM2.5 Prediction Model Based on Combinational Hammerstein Recurrent Neural Networks. MATHEMATICS 2020. [DOI: 10.3390/math8122178] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/17/2022]
Abstract
Airborne particulate matter 2.5 (PM2.5) can have a profound effect on the health of the population. Many researchers have been reporting highly accurate numerical predictions based on raw PM2.5 data imported directly into deep learning models; however, there is still considerable room for improvement in terms of implementation costs due to heavy computational overhead. From the perspective of environmental science, PM2.5 values in a given location can be attributed to local sources as well as external sources. Local sources tend to have a dramatic short-term impact on PM2.5 values, whereas external sources tend to have more subtle but longer-lasting effects. In the presence of PM2.5 from both sources at the same time, this combination of effects can undermine the predictive accuracy of the model. This paper presents a novel combinational Hammerstein recurrent neural network (CHRNN) to enhance predictive accuracy and overcome the heavy computational and monetary burden imposed by deep learning models. The CHRNN comprises a base neural network tasked with learning gradual (long-term) fluctuations in conjunction with add-on neural networks to deal with dramatic (short-term) fluctuations. The CHRNN can be coupled with a random forest model to determine the degree to which short-term effects influence long-term outcomes. We also developed novel feature selection and normalization methods to enhance prediction accuracy. Using real-world measurement data of air quality and PM2.5 datasets from Taiwan, the precision of the proposed system in the numerical prediction of PM2.5 levels was comparable to that of state-of-the-art deep learning models, such as deep recurrent neural networks and long short-term memory, despite far lower implementation costs and computational overhead.
|