1
|
Liao Q, Zhu M, Wu L, Wang D, Wang Z, Zhang S, Cao W, Pan X, Li J, Tang X, Xin J, Sun Y, Zhu J, Wang Z. Probing the capacity of a spatiotemporal deep learning model for short-term PM2.5 forecasts in a coastal urban area. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 950:175233. [PMID: 39102955 DOI: 10.1016/j.scitotenv.2024.175233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Revised: 07/22/2024] [Accepted: 07/31/2024] [Indexed: 08/07/2024]
Abstract
Accurate forecasting of fine particulate matter (PM2.5) is crucial for city air pollution control, yet remains challenging due to the complex urban atmospheric chemical and physical processes. Recently, deep learning has been routinely applied for better urban PM2.5 forecasts. However, its capacity to represent the spatiotemporal urban atmospheric processes remains underexplored, especially compared with traditional approaches such as chemistry-transport models (CTMs) and shallow statistical methods. Here we probe such urban-scale representation capacity of a spatiotemporal deep learning (STDL) model for 24-hour short-term PM2.5 forecasts at six urban stations in Rizhao, a coastal city in China. Compared with two operational CTMs and three statistical models, the STDL model shows its superiority with improvements in all five evaluation metrics, notably in root mean square error (RMSE) for forecasts at lead times within 12 h, with reductions of 49.8% and 47.8%, respectively. This demonstrates the STDL model's capacity to represent nonlinear small-scale phenomena such as street-level emissions and urban meteorology that are in general not well represented in either CTMs or shallow statistical models. This gain of small-scale representation in forecast performance decreases at increasing lead times, leading to similar RMSEs to the statistical methods (linear shallow representations) at about 12 h and to the CTMs (mesoscale representations) at 24 h. The STDL model performs especially well in winter, when complex urban physical and chemical processes dominate the frequent severe air pollution, and in moist conditions fostering hygroscopic growth of particles. The DL-based PM2.5 forecasts align with observed trends under various humidity and wind conditions. 
Such investigation into the potential and limitations of deep learning representation for urban PM2.5 forecasting could hopefully inspire further fusion of distinct representations from CTMs and deep networks to break the conventional limits of short-term PM2.5 forecasts.
Affiliation(s)
- Qi Liao
- College of Electronic Engineering, Chengdu University of Information Technology, Chengdu 610225, China
| | - Mingming Zhu
- Key Laboratory of Atmospheric Environment and Extreme Meteorology, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China.
| | - Lin Wu
- Key Laboratory of Atmospheric Environment and Extreme Meteorology, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China; Carbon Neutrality Research Center (CNRC), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China
| | - Dawei Wang
- State Key Laboratory of Atmospheric Boundary Layer Physics and Atmospheric Chemistry (LAPC), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China
| | - Zixi Wang
- State Key Laboratory of Atmospheric Boundary Layer Physics and Atmospheric Chemistry (LAPC), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China; College of Earth Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Si Zhang
- Carbon Neutrality Research Center (CNRC), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China; College of Earth Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Wudi Cao
- Carbon Neutrality Research Center (CNRC), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China
| | - Xiaole Pan
- Key Laboratory of Atmospheric Environment and Extreme Meteorology, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China; State Key Laboratory of Atmospheric Boundary Layer Physics and Atmospheric Chemistry (LAPC), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China
| | - Jie Li
- Key Laboratory of Atmospheric Environment and Extreme Meteorology, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China; State Key Laboratory of Atmospheric Boundary Layer Physics and Atmospheric Chemistry (LAPC), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China
| | - Xiao Tang
- Key Laboratory of Atmospheric Environment and Extreme Meteorology, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China; State Key Laboratory of Atmospheric Boundary Layer Physics and Atmospheric Chemistry (LAPC), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China
| | - Jinyuan Xin
- Key Laboratory of Atmospheric Environment and Extreme Meteorology, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China; State Key Laboratory of Atmospheric Boundary Layer Physics and Atmospheric Chemistry (LAPC), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China
| | - Yele Sun
- Key Laboratory of Atmospheric Environment and Extreme Meteorology, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China; State Key Laboratory of Atmospheric Boundary Layer Physics and Atmospheric Chemistry (LAPC), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China; College of Earth Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Jiang Zhu
- Key Laboratory of Atmospheric Environment and Extreme Meteorology, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China; International Center for Climate and Environment Science (ICCES), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China
| | - Zifa Wang
- Key Laboratory of Atmospheric Environment and Extreme Meteorology, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China; State Key Laboratory of Atmospheric Boundary Layer Physics and Atmospheric Chemistry (LAPC), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China; College of Earth Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; Center for Excellence in Urban Atmospheric Environment, Institute of Urban Environment, Chinese Academy of Sciences, Xiamen 361021, China
|
2
|
Hu Y, Li Q, Shi X, Yan J, Chen Y. Domain knowledge-enhanced multi-spatial multi-temporal PM2.5 forecasting with integrated monitoring and reanalysis data. ENVIRONMENT INTERNATIONAL 2024; 192:108997. [PMID: 39293234 DOI: 10.1016/j.envint.2024.108997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2024] [Revised: 07/31/2024] [Accepted: 09/02/2024] [Indexed: 09/20/2024]
Abstract
Accurate air quality forecasting is crucial for public health, environmental monitoring and protection, and urban planning. However, existing methods fail to effectively utilize multi-scale information, both spatially and temporally. There is a lack of integration between individual monitoring stations and city-wide scales. Temporally, the periodic nature of air quality variations is often overlooked or inadequately considered. To overcome these limitations, we conduct a thorough analysis of the data and tasks, integrating spatio-temporal multi-scale domain knowledge. We present a novel Multi-spatial Multi-temporal air quality forecasting method based on Graph Convolutional Networks and Gated Recurrent Units (M2G2), bridging the gap in air quality forecasting across spatial and temporal scales. The proposed framework consists of two modules: Multi-scale Spatial GCN (MS-GCN) for spatial information fusion and Multi-scale Temporal GRU (MT-GRU) for temporal information integration. In the spatial dimension, the MS-GCN module employs a bidirectional learnable structure and a residual structure, enabling comprehensive information exchange between individual monitoring stations and the city-scale graph. Regarding the temporal dimension, the MT-GRU module adaptively combines information from different temporal scales through parallel hidden states. Leveraging meteorological indicators and four air quality indicators, we present comprehensive comparative analyses and ablation experiments, showcasing the higher accuracy of M2G2 in comparison to nine currently available advanced approaches across all aspects. The improvements of M2G2 over the second-best method on RMSE of 72-h future predictions are as follows: PM2.5: 6%∼10%; PM10: 5%∼7%; NO2: 5%∼16%; O3: 6%∼9%. Furthermore, we demonstrate the effectiveness of each module of M2G2 by ablation study. 
We conduct a sensitivity analysis of air quality and meteorological data, finding that the introduction of O3 adversely impacts the prediction accuracy of PM2.5.
Affiliation(s)
- Yuxiao Hu
- Department of Building Environment and Energy Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, China; Ningbo Institute of Digital Twin, Eastern Institute of Technology, Ningbo 315200, China
| | - Qian Li
- Ningbo Institute of Digital Twin, Eastern Institute of Technology, Ningbo 315200, China; School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Xiaodan Shi
- School of Business, Society and Technology, Mälardalens University, Västerås 72123, Sweden
| | - Jinyue Yan
- Department of Building Environment and Energy Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, China
| | - Yuntian Chen
- Ningbo Institute of Digital Twin, Eastern Institute of Technology, Ningbo 315200, China
|
3
|
Folifack Signing VR, Mbarndouka Taamté J, Kountchou Noube M, Hamadou Yerima A, Azzopardi J, Tchuente Siaka YF, Saïdou. IoT-based monitoring system and air quality prediction using machine learning for a healthy environment in Cameroon. ENVIRONMENTAL MONITORING AND ASSESSMENT 2024; 196:621. [PMID: 38879702 DOI: 10.1007/s10661-024-12789-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Accepted: 06/06/2024] [Indexed: 07/11/2024]
Abstract
This paper is aimed at developing an air quality monitoring system using machine learning (ML), Internet of Things (IoT), and other elements to predict the level of particulate matter and gases in the air based on the air quality index (AQI). It is an air quality assessor and therefore a means of achieving the Sustainable Development Goals (SDGs), in particular, SDG 3.9 (substantial reduction of the health impacts of hazardous substances) and SDG 11.6 (reduction of negative impacts on cities and populations). AQI quantifies and informs the public about air pollutants and their adverse effects on public health. The proposed air quality monitoring device is low-cost and operates in real-time. It consists of a hardware unit that detects various pollutants to assess air quality as well as other airborne particles such as carbon dioxide (CO2), methane (CH4), volatile organic compounds (VOCs), nitrogen dioxide (NO2), carbon monoxide (CO), and particulate matter with an aerodynamic diameter of 2.5 microns or less (PM2.5). To predict air quality, the device was deployed from November 1, 2022, to February 4, 2023, in certain bauxite-rich areas of Adamawa and certain volcanic sites in western Cameroon. Therefore, machine learning algorithm models, namely, multiple linear regression (MLR), support vector regression (SVR), random forest regression (RFR), XGBoost (XGB), and K-nearest neighbors (KNN) were applied to analyze the collected concentrations and predict the future state of air quality. The performance of these models was evaluated using mean absolute error (MAE), coefficient of determination (R-square), and root mean square error (RMSE). The obtained data in this study show that these pollutants are present in selected localities albeit to different extents. Moreover, the AQI values obtained range from 10 to 530, with a mean of 132.380 ± 63.705, corresponding to moderate air quality state but may induce an adverse effect on sensitive members of the population. 
This study revealed that XGB regression performed better in air quality forecasting with the highest R-squared (test score of 0.9991 and train score of 0.9999) and lowest RMSE (test score of 1.5748 and train score of 0.0073) and MAE (test score of 0.0872 and train score of 0.0020), while the KNN model had the worst prediction (lowest R-squared and highest RMSE and MAE). This embryonic work is a prototype for projects in Cameroon as measurements are underway for a national spread over a longer period of time.
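The three scores named in this abstract (MAE, R-squared, RMSE) can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not the authors' code; the function names and the sample values are hypothetical and are not data from the study.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Hypothetical AQI observations and model predictions (illustrative only).
obs = [50.0, 100.0, 150.0, 200.0]
pred = [52.0, 98.0, 151.0, 199.0]

print("RMSE:", rmse(obs, pred))
print("MAE:", mae(obs, pred))
print("R2:", r_squared(obs, pred))
```

Computing the same scores on both the training and the test split, as the abstract reports, makes any gap between the two visible.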
Affiliation(s)
- Vitrice Ruben Folifack Signing
- Research Centre for Nuclear Science and Technology, Institute of Geological and Mining Research, P.O. Box 4110, Yaoundé, Cameroon
| | - Jacob Mbarndouka Taamté
- Research Centre for Nuclear Science and Technology, Institute of Geological and Mining Research, P.O. Box 4110, Yaoundé, Cameroon
| | - Michaux Kountchou Noube
- Research Centre for Nuclear Science and Technology, Institute of Geological and Mining Research, P.O. Box 4110, Yaoundé, Cameroon
| | - Abba Hamadou Yerima
- Research Centre for Nuclear Science and Technology, Institute of Geological and Mining Research, P.O. Box 4110, Yaoundé, Cameroon
| | - Joel Azzopardi
- Department of Artificial Intelligence, Faculty of Information and Communication Technology, University of Malta, Msida, Malta
| | - Yvette Flore Tchuente Siaka
- Research Centre for Nuclear Science and Technology, Institute of Geological and Mining Research, P.O. Box 4110, Yaoundé, Cameroon.
| | - Saïdou
- Research Centre for Nuclear Science and Technology, Institute of Geological and Mining Research, P.O. Box 4110, Yaoundé, Cameroon
- Nuclear Physics Laboratory, Faculty of Science, University of Yaoundé I, P.O. Box 812, Yaoundé, Cameroon
|
4
|
Wu H, Yang T, Li H, Zhou Z. Air quality prediction model based on mRMR-RF feature selection and ISSA-LSTM. Sci Rep 2023; 13:12825. [PMID: 37550459 PMCID: PMC10406845 DOI: 10.1038/s41598-023-39838-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Accepted: 07/31/2023] [Indexed: 08/09/2023] Open
Abstract
Severe air pollution poses a significant threat to public safety and human health. Predicting future air quality conditions is crucial for implementing pollution control measures and guiding residents' activity choices. However, traditional single-module machine learning models suffer from long training times and low prediction accuracy. To improve the accuracy of air quality forecasting, this paper proposes an ISSA-LSTM model-based approach for predicting the air quality index (AQI). The model consists of three main components: random forest (RF) and mRMR, improved sparrow search algorithm (ISSA), and long short-term memory network (LSTM). Firstly, RF-mRMR is used to select the influential variables affecting AQI, thereby enhancing the model's performance. Next, the ISSA algorithm is employed to optimize the hyperparameters of the LSTM, further improving the model's performance. Finally, the LSTM model is utilized to predict AQI values. Through comparative experiments, it is demonstrated that the ISSA-LSTM model outperforms other models in terms of RMSE and R2, exhibiting higher prediction accuracy. The model's predictive performance is validated across different time steps, demonstrating minimal prediction errors. Therefore, the ISSA-LSTM model is a viable and effective approach for accurately predicting AQI.
Affiliation(s)
- Huiyong Wu
- College of Science, Shenyang University of Chemical Technology, Shenyang, Liaoning, China
| | - Tongtong Yang
- College of Science, Shenyang University of Chemical Technology, Shenyang, Liaoning, China.
| | - Hongkun Li
- College of Science, Shenyang University of Chemical Technology, Shenyang, Liaoning, China
| | - Ziwei Zhou
- College of Science, Shenyang University of Chemical Technology, Shenyang, Liaoning, China
|
5
|
Xu J, Wang S, Ying N, Xiao X, Zhang J, Jin Z, Cheng Y, Zhang G. Dynamic graph neural network with adaptive edge attributes for air quality prediction: A case study in China. Heliyon 2023; 9:e17746. [PMID: 37456022 PMCID: PMC10345359 DOI: 10.1016/j.heliyon.2023.e17746] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/09/2022] [Revised: 06/27/2023] [Accepted: 06/27/2023] [Indexed: 07/18/2023] Open
Abstract
Air quality prediction is a typical spatiotemporal modeling problem, which usually uses different components to handle spatial and temporal dependencies in complex systems separately. Previous models based on time series analysis and recurrent neural network (RNN) methods have only modeled time series while ignoring spatial information. Previous methods based on graph convolutional networks (GCNs) usually require the spatial correlation graph structure of observation sites to be provided in advance. The correlations among these sites and their strengths are usually calculated using prior information. However, due to the limitations of human cognition, limited prior information cannot reflect the real station-related structure or bring more effective information for accurate prediction. To this end, we propose a novel Dynamic Graph Neural Network with Adaptive Edge Attributes (DGN-AEA) on the message passing network, which generates the adaptive bidirected dynamic graph by learning the edge attributes as model parameters. Unlike methods that use prior information to establish edges, our method can obtain adaptive edge information through end-to-end training without any prior information, thus reducing the complexity of the problem. Besides, the hidden structural information between the stations can be obtained as model by-products, which can help in subsequent decision-making analyses. Experimental results show that our model achieves state-of-the-art performance compared with other baselines.
Affiliation(s)
- Jing Xu
- School of Systems Science, Beijing Normal University, Beijing, 100875, China
| | - Shuo Wang
- School of Systems Science, Beijing Normal University, Beijing, 100875, China
- Information Technology and Electrical Engineering, ETH Zurich, Zurich, 8092, Switzerland
- Swarma Research, Beijing, China
| | - Na Ying
- Chinese Research Academy of Environmental Sciences, Beijing, 100085, China
| | - Xiao Xiao
- School of Telecommunications Engineering, Xidian University, Xi'an, 710071, Shaanxi, China
| | - Jiang Zhang
- School of Systems Science, Beijing Normal University, Beijing, 100875, China
- Swarma Research, Beijing, China
| | - Zhiling Jin
- School of Telecommunications Engineering, Xidian University, Xi'an, 710071, Shaanxi, China
| | - Yun Cheng
- Information Technology and Electrical Engineering, ETH Zurich, Zurich, 8092, Switzerland
| | - Gangfeng Zhang
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Beijing Normal University, Beijing, 100875, China
- Faculty of Geophysical Science, Beijing Normal University, Beijing, 100875, China
|
6
|
Spatiotemporal distribution, trend, forecast, and influencing factors of transboundary and local air pollutants in Nagasaki Prefecture, Japan. Sci Rep 2023; 13:851. [PMID: 36646784 PMCID: PMC9842204 DOI: 10.1038/s41598-023-27936-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2022] [Accepted: 01/10/2023] [Indexed: 01/18/2023] Open
Abstract
The study of PM2.5 and NO2 has been emphasized in recent years due to their adverse effects on public health. To better understand these pollutants, many studies have researched the spatiotemporal distribution, trend, forecast, or influencing factors of these pollutants. However, studies have rarely combined these to generate a more holistic understanding that can be used to assess air pollution and implement more effective strategies. In this study, we analyze the spatiotemporal distribution, trend, forecast, and factors influencing PM2.5 and NO2 in Nagasaki Prefecture by using ordinary kriging, Pearson's correlation, random forest, Mann-Kendall, auto-regressive integrated moving average, and error, trend, and seasonal models. The results indicated that PM2.5, due to its long-range transport properties, has a more substantial spatiotemporal variation and affects larger areas in comparison to NO2, which is a local pollutant. Despite tri-national efforts, local regulations and legislation have been effective in reducing NO2 concentration but less effective in reducing PM2.5. This multi-method approach provides a holistic understanding of PM2.5 and NO2 pollution in Nagasaki Prefecture, which can aid in implementing more effective pollution management strategies. It can also be implemented in other regions where studies have only focused on one of the aspects of air pollution and where a holistic understanding of air pollution is lacking.
|
7
|
Fan K, Dhammapala R, Harrington K, Lamb B, Lee Y. Machine learning-based ozone and PM2.5 forecasting: Application to multiple AQS sites in the Pacific Northwest. Front Big Data 2023; 6:1124148. [PMID: 36910164 PMCID: PMC9999009 DOI: 10.3389/fdata.2023.1124148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Accepted: 02/06/2023] [Indexed: 03/14/2023] Open
Abstract
Air quality in the Pacific Northwest (PNW) of the U.S. has generally been good in recent years, but unhealthy events were observed due to wildfires in summer or wood burning in winter. The current air quality forecasting system, which uses chemical transport models (CTMs), has had difficulty forecasting these unhealthy air quality events in the PNW. We developed a machine learning (ML) based forecasting system, which consists of two components, ML1 (random forest classifiers and multiple linear regression models) and ML2 (two-phase random forest regression model). Our previous study showed that the ML system provides reliable forecasts of O3 at a single monitoring site in Kennewick, WA. In this paper, we expand the ML forecasting system to predict both O3 in the wildfire season and PM2.5 in wildfire and cold seasons at all available monitoring sites in the PNW during 2017-2020, and evaluate our ML forecasts against the existing operational CTM-based forecasts. For O3, both ML1 and ML2 are used to achieve the best forecasts, which was the case in our previous study: ML2 performs better overall (R2 = 0.79), especially for low-O3 events, while ML1 correctly captures more high-O3 events. Compared to the CTM-based forecast, our O3 ML forecasts reduce the normalized mean bias (NMB) from 7.6 to 2.6% and normalized mean error (NME) from 18 to 12% when evaluated against the observations. For PM2.5, ML2 performs the best and thus is used for the final forecasts. Compared to the CTM-based PM2.5, ML2 clearly improves PM2.5 forecasts for both wildfire season (May to September) and cold season (November to February): ML2 reduces NMB (-27 to 7.9% for wildfire season; 3.4 to 2.2% for cold season) and NME (59 to 41% for wildfire season; 67 to 28% for cold season) significantly and captures more high-PM2.5 events correctly. 
Our ML air quality forecast system requires fewer computing resources and fewer input datasets, yet it provides more reliable forecasts than (if not, comparable to) the CTM-based forecast. It demonstrates that our ML system is a low-cost, reliable air quality forecasting system that can support regional/local air quality management.
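The two evaluation statistics quoted in this abstract, normalized mean bias (NMB) and normalized mean error (NME), can be sketched as below. The abstract does not spell out the formulas, so this assumes the definitions commonly used in air quality model evaluation (sum of errors, or of absolute errors, divided by the sum of observations, expressed in percent). The function names and the sample values are hypothetical.

```python
def nmb(obs, pred):
    """Normalized mean bias in percent: 100 * sum(pred - obs) / sum(obs)."""
    return 100.0 * sum(p - o for o, p in zip(obs, pred)) / sum(obs)


def nme(obs, pred):
    """Normalized mean error in percent: 100 * sum(|pred - obs|) / sum(obs)."""
    return 100.0 * sum(abs(p - o) for o, p in zip(obs, pred)) / sum(obs)


# Hypothetical daily PM2.5 observations and forecasts (illustrative only).
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 41.0]

print("NMB (%):", nmb(obs, pred))
print("NME (%):", nme(obs, pred))
```

NMB keeps the sign of the error, so over- and under-predictions can cancel, whereas NME does not. This is why a forecast can show a small NMB alongside a much larger NME, as in the values reported above.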
Affiliation(s)
- Kai Fan
- Center for Advanced Systems Understanding, Görlitz, Germany; Helmholtz-Zentrum Dresden Rossendorf, Dresden, Germany; Laboratory for Atmospheric Research, Department of Civil and Environmental Engineering, Washington State University, Pullman, WA, United States
| | - Ranil Dhammapala
- South Coast Air Quality Management District, Diamond Bar, CA, United States
| | | | - Brian Lamb
- Laboratory for Atmospheric Research, Department of Civil and Environmental Engineering, Washington State University, Pullman, WA, United States
| | - Yunha Lee
- Center for Advanced Systems Understanding, Görlitz, Germany; Helmholtz-Zentrum Dresden Rossendorf, Dresden, Germany; Laboratory for Atmospheric Research, Department of Civil and Environmental Engineering, Washington State University, Pullman, WA, United States
|
8
|
Dutta D, Pal SK. Prediction and assessment of the impact of COVID-19 lockdown on air quality over Kolkata: a deep transfer learning approach. ENVIRONMENTAL MONITORING AND ASSESSMENT 2022; 195:223. [PMID: 36544059 PMCID: PMC9771789 DOI: 10.1007/s10661-022-10761-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/25/2022] [Accepted: 11/12/2022] [Indexed: 06/17/2023]
Abstract
The present study focuses on the prediction and assessment of the impact of lockdown due to the coronavirus pandemic on air quality during three different phases, viz., normal periods (1 January 2018-23 March 2020), complete lockdown (24 March 2020-31 May 2020), and partial lockdown (1 June 2020-30 September 2020). We identify the most important air pollutants influencing the air quality of Kolkata during the three periods using Random Forest, a tree-based machine learning (ML) algorithm. It is found that the ambient air quality of Kolkata is mainly affected by particulate matter or PM (PM10 and PM2.5). However, the effect of the lockdown is most prominent on PM2.5, which spreads in the air of Kolkata due to diesel-driven vehicles, domestic and commercial combustion activities, road dust, and open burning. To predict urban PM2.5 and PM10 concentrations 24 h in advance, we use a deep learning (DL) model, namely, stacked-bidirectional long short-term memory (stacked-BDLSTM). The model is trained on the normal periods, and it shows superiority over some supervised ML models, like support vector machine, K-nearest neighbor classifier, multilayer perceptron, long short-term memory, and the statistical time series forecasting model autoregressive integrated moving average. This pre-trained stacked-BDLSTM is applied to predict the concentrations of PM2.5 and PM10 during two pandemic situations, viz., complete lockdown and partial lockdown, using a deep model-based transfer learning (TL) approach (TLS-BDLSTM). Transfer learning aims to utilize the information gained from one problem to improve the predictive performance of a learning model for a different but related problem. Our work helps to demonstrate how TL is useful when there is a scarcity of data during the COVID-19 pandemic regarding the drastic change in concentration of pollutants. 
The results reveal the best prediction performance of TLS-BDLSTM with a lead time of 24 h as compared to some well-known traditional ML and statistical models and the pre-trained stacked-BDLSTM. The prediction is then validated using the real-time data obtained during the complete lockdown due to the COVID second wave (16 May-15 June 2021) with different time steps, e.g., 24 h, 48 h, 72 h, and 96-120 h. TLS-BDLSTM involving transfer learning is seen to outperform the compared methods in modeling the long-term temporal dependency of multivariate time series data and to boost the forecast efficiency not only in single-step but also in multi-step prediction. The proposed methodologies are effective and consistent, and can be used by operational organizations for monitoring and management of air quality.
Affiliation(s)
- Debashree Dutta
- Center for Soft Computing Research, Indian Statistical Institute, Kolkata, 700108 India
| | - Sankar K. Pal
- Center for Soft Computing Research, Indian Statistical Institute, Kolkata, 700108 India
|
9
|
Guo Q, Ren M, Wu S, Sun Y, Wang J, Wang Q, Ma Y, Song X, Chen Y. Applications of artificial intelligence in the field of air pollution: A bibliometric analysis. Front Public Health 2022; 10:933665. [PMID: 36159306 PMCID: PMC9490423 DOI: 10.3389/fpubh.2022.933665] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/01/2022] [Accepted: 08/11/2022] [Indexed: 01/25/2023] Open
Abstract
Background: Artificial intelligence (AI) has become widely used in a variety of fields, including disease prediction, environmental monitoring, and pollutant prediction. In recent years, there has also been an increase in the volume of research into the application of AI to air pollution. This study aims to explore the latest trends in the application of AI in the field of air pollution. Methods: All literature on the application of AI to air pollution was retrieved from the Web of Science database. CiteSpace 5.8.R1 was used to analyze countries/regions, institutions, authors, keywords and references cited, and to reveal hot spots and frontiers of AI in atmospheric pollution. Results: Beginning in 1994, publications on AI in air pollution have increased in number, with a surge in research since 2017. The leading country and institution were China (N = 524) and the Chinese Academy of Sciences (N = 58), followed by the United States (N = 455) and Tsinghua University (N = 33), respectively. In addition, the United States (0.24) and England (0.27) showed a high degree of centrality. Most of the identified articles were published in journals related to environmental science; the most cited journal was Atmospheric Environment, which reached nearly 1,000 citations. There were few collaborations among authors, institutions and countries. The hot topics were machine learning, air pollution and deep learning. The majority of the researchers concentrated on air pollutant concentration prediction, particularly the combined use of AI and environmental science methods, low-cost air quality sensors, indoor air quality, and thermal comfort. Conclusion: Research in the field of AI and air pollution has expanded rapidly in recent years. The majority of scholars are from China and the United States, and the Chinese Academy of Sciences is the dominant research institution. The United States and England contribute greatly to the development of the cooperation network. 
Cooperation among research institutions appears to be suboptimal, and strengthening cooperation could greatly benefit this field of research. The prediction of air pollutant concentrations, particularly PM2.5, low-cost air quality sensors, and thermal comfort are the current research hotspots.
Affiliation(s)
- Qiangqiang Guo
  - School of Public Health, Lanzhou University, Lanzhou, China
- Mengjuan Ren
  - School of Public Health, Lanzhou University, Lanzhou, China
- Shouyuan Wu
  - School of Public Health, Lanzhou University, Lanzhou, China
- Yajia Sun
  - School of Public Health, Lanzhou University, Lanzhou, China
- Jianjian Wang
  - School of Public Health, Lanzhou University, Lanzhou, China
- Qi Wang
  - Department of Health Research Methods, Evidence and Impact, Faculty of Health Sciences, McMaster University, Hamilton, ON, Canada
  - McMaster Health Forum, McMaster University, Hamilton, ON, Canada
- Yanfang Ma
  - School of Chinese Medicine, Hong Kong Baptist University, Kowloon Tong, Hong Kong SAR, China
- Xuping Song
  - School of Public Health, Lanzhou University, Lanzhou, China
  - Research Unit of Evidence-Based Evaluation and Guidelines, Chinese Academy of Medical Sciences (2021RU017), School of Basic Medical Sciences, Lanzhou University, Lanzhou, China
  - Lanzhou University Institute of Health Data Science, Lanzhou, China
  - World Health Organization Collaborating Center for Guideline Implementation and Knowledge Translation, Lanzhou, China
  - *Correspondence: Xuping Song
- Yaolong Chen
  - School of Public Health, Lanzhou University, Lanzhou, China
  - Research Unit of Evidence-Based Evaluation and Guidelines, Chinese Academy of Medical Sciences (2021RU017), School of Basic Medical Sciences, Lanzhou University, Lanzhou, China
  - Lanzhou University Institute of Health Data Science, Lanzhou, China
  - World Health Organization Collaborating Center for Guideline Implementation and Knowledge Translation, Lanzhou, China
10
Chen L, Liu X, Zeng C, He X, Chen F, Zhu B. Temperature Prediction of Seasonal Frozen Subgrades Based on CEEMDAN-LSTM Hybrid Model. SENSORS (BASEL, SWITZERLAND) 2022; 22:5742. [PMID: 35957299 PMCID: PMC9370898 DOI: 10.3390/s22155742] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/20/2022] [Revised: 07/28/2022] [Accepted: 07/29/2022] [Indexed: 05/27/2023]
Abstract
Improving the temperature prediction accuracy for subgrades in seasonally frozen regions will greatly help improve the understanding of subgrades' thermal states. Due to the nonlinearity and non-stationarity of the temperature time series of subgrades, it is difficult for a single general neural network to accurately capture these two characteristics. Many hybrid models have been proposed to more accurately forecast the temperature time series. Among these hybrid models, the CEEMDAN-LSTM model is promising, thanks to the advantages of the long short-term memory (LSTM) artificial neural network, which is good at handling complex time series data, and its combination with the broad applicability of the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) in the field of signal decomposition. In this study, by performing empirical mode decomposition (EMD), ensemble empirical mode decomposition (EEMD), and CEEMDAN on temperature time series, respectively, a hybrid dataset is formed with the corresponding time series of volumetric water content and frost heave, and finally, the CEEMDAN-LSTM model is created for prediction purposes. The results of the performance comparisons between multiple models show that the CEEMDAN-LSTM model has the best prediction performance compared to other decomposed LSTM models because the composition of the hybrid dataset improves predictive ability, and thus, it can better handle the nonlinearity and non-stationarity of the temperature time series data.
Affiliation(s)
- Liyue Chen
  - Badong National Observation and Research Station of Geohazards, China University of Geosciences, Wuhan 430074, China; (L.C.); (B.Z.)
  - China Communications Construction Company Second Highway Consultants Co., Ltd., Wuhan 430056, China; (C.Z.); (X.H.); (F.C.)
- Xiao Liu
  - Badong National Observation and Research Station of Geohazards, China University of Geosciences, Wuhan 430074, China; (L.C.); (B.Z.)
- Chao Zeng
  - China Communications Construction Company Second Highway Consultants Co., Ltd., Wuhan 430056, China; (C.Z.); (X.H.); (F.C.)
- Xianzhi He
  - China Communications Construction Company Second Highway Consultants Co., Ltd., Wuhan 430056, China; (C.Z.); (X.H.); (F.C.)
- Fengguang Chen
  - China Communications Construction Company Second Highway Consultants Co., Ltd., Wuhan 430056, China; (C.Z.); (X.H.); (F.C.)
- Baoshan Zhu
  - Badong National Observation and Research Station of Geohazards, China University of Geosciences, Wuhan 430074, China; (L.C.); (B.Z.)
11
An Improved Air Quality Index Machine Learning-Based Forecasting with Multivariate Data Imputation Approach. ATMOSPHERE 2022. [DOI: 10.3390/atmos13071144] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
Accurate, timely air quality index (AQI) forecasting helps industries in selecting the most suitable air pollution control measures and the public in reducing harmful exposure to pollution. This article proposes a comprehensive method to forecast AQIs. Initially, the work focused on predicting hourly ambient concentrations of PM2.5 and PM10 using artificial neural networks. Once the method was developed, the work was extended to the prediction of other criteria pollutants, i.e., O3, SO2, NO2, and CO, which fed into the process of estimating AQI. The prediction of the AQI not only requires the selection of a robust forecasting model but also relies heavily on a sequence of pre-processing steps to select predictors and handle different issues in data, including gaps. The presented method dealt with this by imputing missing entries using missForest, a machine learning-based imputation technique that employs the random forest (RF) algorithm. Unlike the usual practice of using RF at the final forecasting stage, we utilized RF at the data pre-processing stage, i.e., missing data imputation and feature selection, and we obtained promising results. The effectiveness of this imputation method was examined against a linear imputation method for the six criteria pollutants and the AQI. The proposed approach was validated against ambient air quality observations for Al-Jahra, a major city in Kuwait. The results showed that models trained on missForest-imputed data generalized well in AQI forecasting, achieving a prediction accuracy of 92.41% when tested on new unseen data, which is better than earlier findings.
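As an illustration of the linear imputation baseline mentioned in this abstract, the following is a minimal sketch only. It assumes an hourly series in which gaps are marked as None, and the function name `linear_impute` is hypothetical. It is not the study's code, and missForest itself is not reproduced here.

```python
from typing import List, Optional


def linear_impute(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior gaps (None) by straight-line interpolation between the
    nearest observed neighbours. Leading and trailing gaps stay None
    (an assumption made for this sketch)."""
    filled = list(series)
    # Indices of the observed (non-missing) values, in order.
    known = [i for i, value in enumerate(series) if value is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            fraction = (i - left) / span
            filled[i] = series[left] + fraction * (series[right] - series[left])
    return filled


# Hypothetical hourly PM2.5 readings with gaps.
pm25 = [10.0, None, None, 16.0, None, 20.0, None]
print(linear_impute(pm25))
```

A random-forest-based imputer such as missForest would instead predict each missing entry from the other variables, which is why the abstract contrasts the two approaches.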
12
Raju L, Gandhimathi R, Mathew A, Ramesh S. Spatio-temporal modelling of particulate matter concentrations using satellite derived aerosol optical depth over coastal region of Chennai in India. ECOL INFORM 2022. [DOI: 10.1016/j.ecoinf.2022.101681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
13
Xiao X, Jin Z, Wang S, Xu J, Peng Z, Wang R, Shao W, Hui Y. A dual-path dynamic directed graph convolutional network for air quality prediction. THE SCIENCE OF THE TOTAL ENVIRONMENT 2022; 827:154298. [PMID: 35271925 DOI: 10.1016/j.scitotenv.2022.154298] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 10/31/2021] [Revised: 02/28/2022] [Accepted: 02/28/2022] [Indexed: 06/14/2023]
Abstract
Accurate air quality prediction can help cope with air pollution and improve quality of life. As more low-cost air quality sensors are deployed, the growing volume of air quality data provides opportunities to develop more accurate prediction methods. Air quality is affected by many external factors such as position, wind, and other meteorological information. Meanwhile, these factors are spatio-temporally dynamic, and there are many dynamic contextual relationships between them. Many methods for air quality prediction do not consider these complex spatio-temporal correlations and dynamic contextual relationships. In this paper, we propose a dual-path dynamic directed graph convolutional network (DP-DDGCN) for air quality prediction. We first create a dual-path transposed dynamic directed graph according to the static distance relationships of stations and the dynamic relationships generated by wind speed and direction. Then, based on the dual-path dynamic directed graph, we capture the dynamic spatial dependencies more comprehensively. After that, we apply gated recurrent units (GRUs) and add future meteorological features to extract the complex temporal dependencies of historical air quality data. Using dual-path dynamic directed graph blocks and the GRUs, we finally construct a dynamic spatio-temporal gated recurrent block to capture the dynamic spatio-temporal contextual correlations. Based on real-world datasets, which record a large amount of PM2.5 concentration data, we compare the proposed model with the benchmark models. The experimental results show that our proposed model has the best performance in predicting PM2.5 concentrations.
Affiliation(s)
- Xiao Xiao
  - School of Telecommunications Engineering, Xidian University, Xi'an 710071, Shaanxi, China
- Zhiling Jin
  - School of Telecommunications Engineering, Xidian University, Xi'an 710071, Shaanxi, China
- Shuo Wang
  - School of Systems Science, Beijing Normal University, Beijing, 100875, China
- Jing Xu
  - School of Systems Science, Beijing Normal University, Beijing, 100875, China
- Ziyan Peng
  - School of Telecommunications Engineering, Xidian University, Xi'an 710071, Shaanxi, China
- Rui Wang
  - School of Electronic Information, Sichuan University, Chengdu 610065, Sichuan, China
- Wei Shao
  - School of Computing Technologies, RMIT University, Melbourne, Victoria 3000, Australia
- Yilong Hui
  - School of Telecommunications Engineering, Xidian University, Xi'an 710071, Shaanxi, China
  - The State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an 710071, Shaanxi, China
14
Liu B, Zhang Y. Calibration of miniature air quality detector monitoring data with PCA-RVM-NAR combination model. Sci Rep 2022; 12:9333. [PMID: 35661143 PMCID: PMC9167304 DOI: 10.1038/s41598-022-13531-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Accepted: 05/25/2022] [Indexed: 11/09/2022] Open
Abstract
The development of miniature air quality detectors makes it possible to monitor air quality in real time and on a grid basis. However, the accuracy with which miniature air quality detectors measure pollutants needs to be improved. In this paper, the PCA-RVM-NAR combined model is proposed to calibrate the measurements of the miniature air quality detector. First, correlation analysis is used to identify the main factors affecting pollutant concentrations. Second, principal component analysis is used to reduce the dimensionality of these main factors and extract their main information. Third, taking the extracted principal components as independent variables and the observed values of pollutant concentrations as dependent variables, a PCA-RVM model is established using the relevance vector machine. Finally, a nonlinear autoregressive neural network is used to correct the error, completing the PCA-RVM-NAR model. Root mean square error, goodness of fit, mean absolute error and relative mean absolute percent error are used to compare the calibration effect of the PCA-RVM-NAR model with that of other commonly used models such as the multiple linear regression model, support vector machine, multilayer perceptron neural network and nonlinear autoregressive models with exogenous input. The results show that, for every pollutant, the PCA-RVM-NAR model achieves better calibration results than the other models on all four indicators. Using this model to correct the data of the miniature air quality detector can improve its accuracy by 77.8-93.9%.
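The error indicators named in this abstract can be sketched as follows. This is an illustrative version that assumes the standard textbook definitions of RMSE, MAE and MAPE; the paper's exact formula for "relative mean absolute percent error" is not given here, so the `mape` function below is an assumption, and all function names are hypothetical.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


def mape(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percent error, in percent.
    Assumes no observed value is zero."""
    relative = [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(relative) / len(relative)


# Hypothetical reference-station values and calibrated detector values.
reference = [10.0, 20.0, 30.0]
calibrated = [12.0, 18.0, 33.0]
print(rmse(reference, calibrated), mae(reference, calibrated), mape(reference, calibrated))
```

Lower values of all three indicators correspond to a better calibration; comparing them before and after correction is one way to express an accuracy improvement such as the one reported above.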
Affiliation(s)
- Bing Liu
  - Public Foundational Courses Department, Nanjing Vocational University of Industry Technology, Nanjing, 210023, China
- Yirui Zhang
  - School of Intelligent Manufacturing, Sanmenxia Polytechnic, Sanmenxia, 472000, China
15
Bharathi PD, Narayanan VA, Sivakumar PB. Fog computing enabled air quality monitoring and prediction leveraging deep learning in IoT. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2022. [DOI: 10.3233/jifs-212713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
With rapid industrialization and urbanization worldwide, air quality levels are deteriorating at an unprecedented rate and posing a substantial threat to humans and the environment. This raises the need to monitor and forecast air quality levels effectively in real time. Conventional air quality monitoring stations are built on centralized architectures involving high latency, communication technologies demanding high power, sensors involving high costs, and decision making with moderate accuracy. To address the limitations of the existing systems, we propose a smart and distinct Air Quality Monitoring and Forecasting system embracing Fog Computing with IoT and Deep Learning (DL). The system is a three-layered architecture with the Sensing layer first, the Fog Computing layer in between, and the Cloud Computing layer at the end. Fog Computing is a powerful new-generation paradigm that brings storage, computation, and networking to the edge of the IoT network and reduces network latency. A DL-based BiLSTM (Bidirectional Long Short-Term Memory) model is deployed in the Fog Computing layer. The proposed system aims at real-time monitoring and accurate air quality forecasting to support decision making and aid timely prevention and control of pollutant emissions by alerting the stakeholders when a dangerous Air Quality Index (AQI) is expected. Experimental results show that the BiLSTM model, which takes the meteorological parameters into account, has better predictive performance than the baseline models in terms of MAE and RMSE. A proof of concept realizing the proposed system is elaborated in the paper.
Affiliation(s)
- P. Divya Bharathi
  - Department of Computer Science and Engineering, Amrita School of Engineering, Coimbatore, Amrita VishwaVidyapeetham, India
- V. Anantha Narayanan
  - Department of Computer Science and Engineering, Amrita School of Engineering, Coimbatore, Amrita VishwaVidyapeetham, India
- P. Bagavathi Sivakumar
  - Department of Computer Science and Engineering, Amrita School of Engineering, Coimbatore, Amrita VishwaVidyapeetham, India
16
17
Fan K, Dhammapala R, Harrington K, Lamastro R, Lamb B, Lee Y. Development of a Machine Learning Approach for Local-Scale Ozone Forecasting: Application to Kennewick, WA. Front Big Data 2022; 5:781309. [PMID: 35237751 PMCID: PMC8883518 DOI: 10.3389/fdata.2022.781309] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/22/2021] [Accepted: 01/19/2022] [Indexed: 11/13/2022] Open
Abstract
Chemical transport models (CTMs) are widely used for air quality forecasts, but these models require large computational resources and often suffer from a systematic bias that leads to missed poor air pollution events. For example, a CTM-based operational forecasting system for air quality over the Pacific Northwest, called AIRPACT, uses over 100 processors for several hours to provide 48-h forecasts daily, but struggles to capture unhealthy O3 episodes during the summer and early fall, especially over Kennewick, WA. This research developed machine learning (ML) based O3 forecasts for Kennewick, WA to demonstrate an improved forecast capability. We used the 2017–2020 simulated meteorology and O3 observation data from Kennewick as training datasets. The meteorology datasets are from the Weather Research and Forecasting (WRF) meteorological model forecasts produced daily by the University of Washington. Our ozone forecasting system consists of two ML models, ML1 and ML2, to improve predictability: ML1 uses the random forest (RF) classifier and multiple linear regression (MLR) models, and ML2 uses a two-phase RF regression model with best-fit weighting factors. To avoid overfitting, we evaluate the ML forecasting system with the 10-time, 10-fold, and walk-forward cross-validation analysis. Compared to AIRPACT, ML1 improved forecast skill for high-O3 events and captured 5 out of 10 unhealthy O3 events, while AIRPACT and ML2 missed all the unhealthy events. ML2 showed better forecast skill for less elevated-O3 events. Based on this result, we set up our ML modeling framework to use ML1 for high-O3 events and ML2 for less elevated O3 events. Since May 2019, the ML modeling framework has been used to produce daily 72-h O3 forecasts and has provided forecasts via the web for clean air agency and public use: http://ozonematters.com/. Compared to the testing period, the operational forecasting period has not had unhealthy O3 events. 
Nevertheless, the ML modeling framework demonstrated a reliable forecasting capability at a selected location with much less computational resources. The ML system uses a single processor for minutes compared to the CTM-based forecasting system using more than 100 processors for hours.
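The walk-forward cross-validation mentioned in this abstract can be sketched as an expanding-window split, in which each fold trains on all earlier samples and tests on the block that follows. This is a generic illustration under stated assumptions, not the authors' setup: the function name `walk_forward_splits` and its parameters are hypothetical, and the expanding (rather than sliding) window is a choice made for this sketch.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int, initial_train: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs in time order.

    Each fold trains on every sample before the test block, so no future
    observation leaks into training."""
    start = initial_train
    while start + test_size <= n_samples:
        train = list(range(0, start))
        test = list(range(start, start + test_size))
        yield train, test
        start += test_size


# Ten time-ordered samples, four used for the first training window.
for train_idx, test_idx in walk_forward_splits(10, initial_train=4, test_size=3):
    print(train_idx, test_idx)
```

Unlike shuffled k-fold splits, this ordering mirrors operational forecasting, where a model is only ever evaluated on data later than anything it was trained on.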
Affiliation(s)
- Kai Fan
  - Center for Advanced Systems Understanding, Görlitz, Germany
  - Helmholtz-Zentrum Dresden Rossendorf, Dresden, Germany
  - Laboratory for Atmospheric Research, Department of Civil and Environmental Engineering, Washington State University, Pullman, WA, United States
- Ranil Dhammapala
  - Washington State Department of Ecology, Olympia, WA, United States
- Ryan Lamastro
  - Environmental Geochemical Science, School of Science and Engineering, State University of New York, New Paltz, NY, United States
- Brian Lamb
  - Laboratory for Atmospheric Research, Department of Civil and Environmental Engineering, Washington State University, Pullman, WA, United States
- Yunha Lee
  - Center for Advanced Systems Understanding, Görlitz, Germany
  - Helmholtz-Zentrum Dresden Rossendorf, Dresden, Germany
  - Laboratory for Atmospheric Research, Department of Civil and Environmental Engineering, Washington State University, Pullman, WA, United States
  - *Correspondence: Yunha Lee
18
Chelani AB, Gautam S. The influence of meteorological variables and lockdowns on COVID-19 cases in urban agglomerations of Indian cities. STOCHASTIC ENVIRONMENTAL RESEARCH AND RISK ASSESSMENT : RESEARCH JOURNAL 2022; 36:2949-2960. [PMID: 35095340 PMCID: PMC8787448 DOI: 10.1007/s00477-021-02160-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Accepted: 12/15/2021] [Indexed: 05/04/2023]
Abstract
Coronavirus disease has been identified as one of the deadliest diseases, and the WHO has declared it a pandemic and a global health crisis. It has become a massive challenge for humanity. India has also been hit hard, as the virus is highly infectious and mutates rapidly. To control its spread, many interventions have been applied in India since the first reported case on January 30, 2020. Several studies have been conducted over the past year and a half to assess the impact of climatic and weather conditions on its spread. Because it is well established that temperature and humidity can trigger the onset of diseases such as influenza and respiratory disorders, a relationship between meteorological variables and the number of confirmed COVID-19 cases has been anticipated, and the association of several meteorological variables with the number of confirmed cases has been studied. The conclusions in those studies are based on data obtained at an early stage, and inferences drawn from such short time series may not be valid over a longer period. This study attempted to assess the influence of temperature, humidity, wind speed, dew point, the previous day's number of deaths, and government interventions on the number of confirmed COVID-19 cases in 18 districts of India. It also attempted to identify the important predictors of the number of confirmed COVID-19 cases in those districts. The random forest model and a hybrid model obtained by modelling the random forest model's residuals are used to predict the response variable. It is observed that meteorological variables are useful only to some extent when used with the data on the previous day's deaths and lockdown information in predicting the number of COVID-19 cases. Partial lockdown is more important than complete or no lockdown in predicting the number of confirmed COVID-19 cases. Since the time span of the data in the study is reasonably large, the information is useful to policymakers in balancing restrictions against economic losses to individuals and the government.
Affiliation(s)
- Asha B. Chelani
  - Air Pollution Control Division, Nagpur, India
  - Department of Civil Engineering, Karunya Institute of Technology and Sciences, Coimbatore, Tamil Nadu 641114 India
- Sneha Gautam
  - National Environmental Engineering Research Institute (CSIR-NEERI), Nehru Marg, Nagpur, 440020 India
  - Department of Civil Engineering, Karunya Institute of Technology and Sciences, Coimbatore, Tamil Nadu 641114 India
19
Bekkar A, Hssina B, Douzi S, Douzi K. Air-pollution prediction in smart city, deep learning approach. JOURNAL OF BIG DATA 2021; 8:161. [PMID: 34956819 PMCID: PMC8693596 DOI: 10.1186/s40537-021-00548-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 08/21/2021] [Accepted: 12/10/2021] [Indexed: 06/14/2023]
Abstract
Over the past few decades, due to human activities, industrialization, and urbanization, air pollution has become a life-threatening factor in many countries around the world. Among air pollutants, particulate matter with a diameter of less than 2.5 μm (PM2.5) is a serious health problem. It causes various illnesses such as respiratory tract and cardiovascular diseases. Hence, it is necessary to accurately predict PM2.5 concentrations in order to protect citizens from the dangerous impact of air pollution in advance. The variation of PM2.5 depends on a variety of factors, such as meteorology and the concentration of other pollutants in urban areas. In this paper, we implemented a deep learning solution to forecast the hourly PM2.5 concentration in Beijing, China, based on CNN-LSTM, with a spatial-temporal feature that combines historical pollutant data, meteorological data, and PM2.5 concentrations at adjacent stations. We examined the difference in performance among deep learning algorithms such as LSTM, Bi-LSTM, GRU, Bi-GRU, CNN, and a hybrid CNN-LSTM model. Experimental results indicate that our method, "hybrid CNN-LSTM multivariate", enables more accurate predictions than all the listed traditional models.
Affiliation(s)
- Badr Hssina
  - FSTM, University Hassan II, Casablanca, Morocco
20
Jin G, Sha H, Feng Y, Cheng Q, Huang J. GSEN: An ensemble deep learning benchmark model for urban hotspots spatiotemporal prediction. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.05.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 10/21/2022]
21
Design of a Spark Big Data Framework for PM2.5 Air Pollution Forecasting. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2021; 18:ijerph18137087. [PMID: 34281023 PMCID: PMC8296958 DOI: 10.3390/ijerph18137087] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/19/2021] [Revised: 06/29/2021] [Accepted: 06/30/2021] [Indexed: 12/05/2022]
Abstract
In recent years, with rapid economic development, air pollution has become extremely serious, causing many negative effects on health, the environment and medical costs. PM2.5 is one of the main components of air pollution. Therefore, it is necessary to know PM2.5 levels in advance to protect health. Many studies on air quality are based on the government's official air quality monitoring stations, which cannot be widely deployed due to high cost constraints. Furthermore, government monitoring stations update only once an hour, so it is hard to capture short-term PM2.5 concentration peaks that arrive with little warning. However, short-term data from many stations are voluminous and are complex to compute, analyze and predict. A big data framework alleviates the high computational requirements of the original predictor, which makes Spark suitable for the problem considered. This study proposes a PM2.5 instant prediction architecture based on the Spark big data framework to handle the huge volume of data from the LASS community. The Spark big data framework proposed in this study is divided into three modules. It collects real-time PM2.5 data and performs ensemble learning through three machine learning algorithms (Linear Regression, Random Forest, Gradient Boosting Decision Tree) to predict the PM2.5 concentration in the next 30 to 180 min, with an accompanying visualization graph. The experimental results show that the proposed Spark big data ensemble prediction model performs best for the next 30-min prediction (R2 up to 0.96), and the ensemble model performs better than any single machine learning model. Taiwan has long suffered from relatively poor air quality. Air pollutant monitoring data from the LASS community can provide broader monitoring coverage; however, the data are large and difficult to integrate or analyze. The proposed Spark big data framework system can provide short-term PM2.5 forecasts and help decision-makers take proper action immediately.
22
Wang X, Chai Y, Li H, Wang W, Sun W. Graph Convolutional Network-based Model for Incident-related Congestion Prediction: A Case Study of Shanghai Expressways. ACM TRANSACTIONS ON MANAGEMENT INFORMATION SYSTEMS 2021. [DOI: 10.1145/3451356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
Traffic congestion has become a significant obstacle to the development of mega cities in China. Although local governments have used many resources in constructing road infrastructure, it is still insufficient for the increasing traffic demands. As a first step toward optimizing real-time traffic control, this study uses Shanghai Expressways as a case study to predict incident-related congestions. Our study proposes a graph convolutional network-based model to identify correlations in multi-dimensional sensor-detected data, while simultaneously taking into account environmental, spatiotemporal, and network features in predicting traffic conditions immediately after a traffic incident. The average accuracy, average AUC, and average F-1 score of the predictive model are 92.78%, 95.98%, and 88.78%, respectively, on small-scale ground-truth data. Furthermore, we improve the predictive model’s performance using semi-supervised learning by including more unlabeled data instances. As a result, the accuracy, AUC, and F-1 score of the model increase by 2.69%, 1.25%, and 4.72%, respectively. The findings of this article have important implications that can be used to improve the management and development of Expressways in Shanghai, as well as other metropolitan areas in China.
Affiliation(s)
- Xi Wang
  - School of Information, Central University of Finance and Economics, Beijing, P.R. China
- Yibo Chai
  - School of Information, Central University of Finance and Economics, Beijing, P.R. China
- Hui Li
  - School of Information, Central University of Finance and Economics, Beijing, P.R. China
- Wenbin Wang
  - College of Business, Shanghai University of Finance and Economics, Shanghai, P.R. China
- Weishan Sun
  - Shanghai Municipal Traffic Command Center, Shanghai, P.R. China
23
Zhu J, Deng F, Zhao J, Zheng H. Attention-based parallel networks (APNet) for PM2.5 spatiotemporal prediction. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021; 769:145082. [PMID: 33485205 DOI: 10.1016/j.scitotenv.2021.145082] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 11/11/2020] [Revised: 12/16/2020] [Accepted: 01/06/2021] [Indexed: 06/12/2023]
Abstract
Urban particulate matter forecasting is an important part of air pollution early warning and control management, especially the forecasting of fine particulate matter (PM2.5). However, the existing PM2.5 concentration prediction methods cannot effectively capture the complex nonlinearity of PM2.5 concentration, and most of them cannot accurately simulate the temporal and spatial dependence of PM2.5 concentration at the same time. In this paper, we propose an attention-based parallel network (APNet), which can extract short-term and long-term temporal features simultaneously based on the attention-based CNN-LSTM multilayer structure to predict PM2.5 concentration in the next 72 h. Firstly, the Maximum Information Coefficient (MIC) is designed for spatiotemporal correlation analysis, fully considering the linearity, non-linearity and non-functionality between the data of each monitoring station. The potential inherent features of the input data are effectively extracted through the convolutional neural network (CNN). Then, an optimized long short-term memory (LSTM) network captures the short-term mutations of the time series. An attention mechanism is further designed for the proposed model, which automatically assigns different weights to different feature states at different time stages to distinguish their importance, and can achieve precise temporal and spatial interpretability. In order to further explore the long-term time features, we propose a Bi-LSTM parallel module to extract the periodic characteristics of PM2.5 concentration from both previous and posterior directions. Experimental results based on a real-world dataset indicate that the proposed model outperforms other existing state-of-the-art methods. Moreover, evaluations of recall (0.790) and precision (0.848) (threshold: 151 μg/m3) for 72 h prediction also verify the feasibility of our proposed model. The methodology can be used for predicting other multivariate time series data in the future.
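The recall and precision figures quoted in this abstract are computed for exceedances of a concentration threshold. A minimal sketch of such an evaluation is given below. It assumes an event is any value at or above the threshold (whether the paper uses a strict or non-strict inequality is not stated here), and the function name `exceedance_precision_recall` and the sample values are hypothetical.

```python
from typing import Sequence, Tuple


def exceedance_precision_recall(
    observed: Sequence[float], predicted: Sequence[float], threshold: float = 151.0
) -> Tuple[float, float]:
    """Return (precision, recall) for threshold-exceedance events.

    An event is a value >= threshold (assumption). NaN is returned when a
    ratio is undefined because its denominator is zero."""
    true_pos = false_pos = false_neg = 0
    for obs, pred in zip(observed, predicted):
        obs_event = obs >= threshold
        pred_event = pred >= threshold
        if obs_event and pred_event:
            true_pos += 1
        elif pred_event:
            false_pos += 1
        elif obs_event:
            false_neg += 1
    predicted_events = true_pos + false_pos
    observed_events = true_pos + false_neg
    precision = true_pos / predicted_events if predicted_events else float("nan")
    recall = true_pos / observed_events if observed_events else float("nan")
    return precision, recall


# Hypothetical observed and forecast PM2.5 values in μg/m3.
obs = [160.0, 140.0, 155.0, 100.0, 170.0]
fcst = [158.0, 152.0, 149.0, 90.0, 180.0]
print(exceedance_precision_recall(obs, fcst))
```

Precision answers how many forecast exceedances actually occurred, while recall answers how many observed exceedances were forecast; reporting both avoids rewarding a model that simply over- or under-predicts pollution episodes.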
Collapse
Affiliation(s)
- Jiaqi Zhu
- School of Automation, Beijing Institute of Technology, Beijing 100081, China
| | - Fang Deng
- School of Automation, Beijing Institute of Technology, Beijing 100081, China; Beijing Institute of Technology Chongqing Innovation Center, Chongqing, 401120, China.
| | - Jiachen Zhao
- School of Automation, Beijing Institute of Technology, Beijing 100081, China
| | - Hao Zheng
- School of Automation, Beijing Institute of Technology, Beijing 100081, China
|
24
|
Abstract
Internet of Things (IoT) is a system that integrates different devices and technologies, removing the necessity of human intervention. This enables the capacity of having smart (or smarter) cities around the world. By hosting different technologies and allowing interactions between them, the internet of things has spearheaded the development of smart city systems for sustainable living, increased comfort and productivity for citizens. The IoT for Smart Cities has many different domains and draws upon various underlying systems for its operation. In this paper, we provide a holistic coverage of the Internet of Things in Smart Cities. We start by discussing the fundamental components that make up the IoT based Smart City landscape followed by the technologies that enable these domains to exist in terms of architectures utilized, networking technologies used as well as the Artificial Algorithms deployed in IoT based Smart City systems. This is then followed up by a review of the most prevalent practices and applications in various Smart City domains. Lastly, the challenges that the deployment of IoT systems for smart cities encounters are discussed, along with mitigation measures.
|
25
|
Analysis and prediction of air quality in Nanjing from autumn 2018 to summer 2019 using PCR-SVR-ARMA combined model. Sci Rep 2021; 11:348. [PMID: 33431941 PMCID: PMC7801597 DOI: 10.1038/s41598-020-79462-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2020] [Accepted: 12/08/2020] [Indexed: 12/31/2022] Open
Abstract
In order to correct the monitoring data of the miniature air quality detector, an air quality prediction model fusing Principal Component Regression (PCR), Support Vector Regression (SVR) machine, and Autoregressive Moving Average (ARMA) model was proposed to improve the prediction accuracy of the six types of pollutants in the air. First, the main information of factors affecting air quality is extracted by principal component analysis, and then principal component regression is used to give the predicted values of six types of pollutants. Second, the support vector regression machine is used to regress the predicted value of principal component regression and various influencing factors. Finally, the autoregressive moving average model is used to correct the residual terms, yielding the final predicted values of the six types of pollutants. The experimental results showed that the proposed combination prediction model of PCR–SVR–ARMA had a better prediction effect than the artificial neural network, the standard support vector regression machine, the principal component regression, and the PCR–SVR method. The Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and relative Mean Absolute Percent Error (MAPE) are used as evaluation indicators to evaluate the PCR–SVR–ARMA model. This model can increase the accuracy of self-built points by 72.6% to 93.2%, and the model has excellent prediction effects in the training set and detection set, indicating that the model has good generalization ability. This model can play an active role in the scientific arrangement and promotion of miniature air quality detectors and grid-based monitoring of the concentration of various pollutants.
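The three evaluation indicators named above have standard definitions, sketched below in plain Python. The function names and sample values are illustrative assumptions, and MAPE is written in its usual percentage form, which may differ in detail from the "relative" variant used in the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; every true value must be nonzero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic values, for illustration only.
y_true = [10.0, 20.0, 40.0]
y_pred = [12.0, 18.0, 40.0]
scores = {"RMSE": rmse(y_true, y_pred), "MAE": mae(y_true, y_pred), "MAPE": mape(y_true, y_pred)}
```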
|
26
|
Abstract
Air, an essential natural resource, has been compromised in terms of quality by economic activities. Considerable research has been devoted to predicting instances of poor air quality, but most studies are limited by insufficient longitudinal data, making it difficult to account for seasonal and other factors. Several prediction models have been developed using an 11-year dataset collected by Taiwan’s Environmental Protection Administration (EPA). Machine learning methods, including adaptive boosting (AdaBoost), artificial neural network (ANN), random forest, stacking ensemble, and support vector machine (SVM), produce promising results for air quality index (AQI) level predictions. A series of experiments, using datasets for three different regions to obtain the best prediction performance from the stacking ensemble, AdaBoost, and random forest, found the stacking ensemble delivers consistently superior performance for R2 and RMSE, while AdaBoost provides best results for MAE.
|
27
|
Lin YC, Shih HS, Lai CY, Tai JK. Investigating a Potential Map of PM 2.5 Air Pollution and Risk for Tourist Attractions in Hsinchu County, Taiwan. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2020; 17:E8691. [PMID: 33238515 PMCID: PMC7700626 DOI: 10.3390/ijerph17228691] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/28/2020] [Revised: 11/17/2020] [Accepted: 11/19/2020] [Indexed: 01/04/2023]
Abstract
In the past few years, human health risks caused by fine particulate matters (PM2.5) and other air pollutants have gradually received attention. According to the Disaster Prevention and Protection Act of Taiwan's Government enforced in 2017, "suspended particulate matter" has officially been acknowledged as a disaster-causing hazard. The long-term exposure to high concentrations of air pollutants negatively affects the health of citizens. Therefore, the precise determination of the spatial long-term distribution of hazardous high-level air pollutants can help protect the health and safety of residents. The analysis of spatial information of disaster potentials is an important measure for assessing the risks of possible hazards. However, the spatial disaster-potential characteristics of air pollution have not been comprehensively studied. In addition, the development of air pollution potential maps of various regions would provide valuable information. In this study, Hsinchu County was chosen as an example. In the spatial data analysis, historical PM2.5 concentration data from the Taiwan Environmental Protection Administration (TWEPA) were used to analyze and estimate spatially the air pollution risk potential of PM2.5 in Hsinchu based on a geographic information system (GIS)-based radial basis function (RBF) spatial interpolation method. The probability that PM2.5 concentrations exceed a standard value was analyzed with the exceedance probability method; in addition, the air pollution risk levels of tourist attractions in Hsinchu County were determined. The results show that the air pollution risk levels of the different seasons are quite different. The most severe air pollution levels usually occur in spring and winter, whereas summer exhibits the best air quality. Xinfeng and Hukou Townships have the highest potential for air pollution episodes in Hsinchu County (approximately 18%). 
Hukou Old Street, which is one of the most important tourist attractions, has a relatively high air pollution risk. The analysis results of this study can be directly applied to other countries worldwide to provide references for tourists, tourism resource management, and air quality management; in addition, the results provide important information on the long-term health risks for local residents in the study area.
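The exceedance probability mentioned above can be illustrated in its simplest empirical form: the fraction of observations that lie above a standard value. The sketch below is an assumption-laden illustration, not the paper's GIS-based procedure; the function name, the 35 μg/m3 threshold, and the seasonal sample values are all hypothetical.

```python
def exceedance_probability(values, threshold):
    """Empirical probability that a concentration exceeds the threshold."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(1 for v in values if v > threshold) / len(values)


# Hypothetical daily PM2.5 values (ug/m3) grouped by season, for illustration only.
daily_pm25 = {
    "winter": [42.0, 38.0, 20.0, 51.0],
    "summer": [12.0, 18.0, 9.0, 36.0],
}
# The 35.0 threshold is an assumed standard value, not taken from the abstract.
seasonal_risk = {
    season: exceedance_probability(values, threshold=35.0)
    for season, values in daily_pm25.items()
}
```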
Affiliation(s)
- Yuan-Chien Lin
- Department of Civil Engineering, National Central University, Taoyuan 32001, Taiwan; (H.-S.S.); (C.-Y.L.); (J.-K.T.)
- Research Center for Hazard Mitigation and Prevention, National Central University, Taoyuan 32001, Taiwan
| | - Hua-San Shih
- Department of Civil Engineering, National Central University, Taoyuan 32001, Taiwan; (H.-S.S.); (C.-Y.L.); (J.-K.T.)
| | - Chun-Yeh Lai
- Department of Civil Engineering, National Central University, Taoyuan 32001, Taiwan; (H.-S.S.); (C.-Y.L.); (J.-K.T.)
| | - Jen-Kuo Tai
- Department of Civil Engineering, National Central University, Taoyuan 32001, Taiwan; (H.-S.S.); (C.-Y.L.); (J.-K.T.)
- Fire Bureau, Hsinchu County Government, Hsinchu County 30295, Taiwan
|
28
|
Su JG, Meng YY, Chen X, Molitor J, Yue D, Jerrett M. Predicting differential improvements in annual pollutant concentrations and exposures for regulatory policy assessment. ENVIRONMENT INTERNATIONAL 2020; 143:105942. [PMID: 32659530 DOI: 10.1016/j.envint.2020.105942] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/01/2020] [Revised: 06/28/2020] [Accepted: 06/30/2020] [Indexed: 05/22/2023]
Abstract
Over the past decade, researchers and policy-makers have become increasingly interested in regulatory and policy interventions to reduce air pollution concentrations and improve human health. Studies have typically relied on relatively sparse environmental monitoring data that lack the spatial resolution to assess small-area improvements in air quality and health. Few studies have integrated multiple types of measures of an air pollutant into one single modeling framework that combines spatially- and temporally-rich monitoring data. In this paper, we investigated the differential effects of California emissions reduction plan on reducing air pollution between those living in the goods movement corridors (GMC) that are within 500 m of major highways that serve as truck routes to those farther away or adjacent to routes that prohibit trucks. A mixed effects Deletion/Substitution/Addition (D/S/A) machine learning algorithm was developed to model annual pollutant concentrations of nitrogen dioxide (NO2) by taking repeated measures into consideration and by integrating multiple types of NO2 measurements, including those through government regulatory and research-oriented saturation monitoring into a single modeling framework. Difference-in-difference analysis was conducted to identify whether those living in GMC demonstrated statistically larger reductions in air pollution exposure. The mixed effects D/S/A machine learning modeling result indicated that GMC had 2 ppb greater reductions in NO2 concentrations from pre- to post-policy period than far away areas. The difference-in-difference analysis demonstrated that the subjects living in GMC experienced statistically significant greater reductions in NO2 exposure than those living in the far away areas. 
This study contributes to scientific knowledge by providing empirical evidence that improvements in air quality via the emissions reductions plan policies impacted traffic-related air pollutant concentrations and associated exposures most among low-income Californians with chronic conditions living in GMC. The identified differences in pollutant reductions across different location domains may be applicable to other states or other countries if similar policies are enacted.
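The difference-in-difference idea used above can be shown with the basic two-group, two-period estimator: the change in the treated group minus the change in the comparison group. This is only a sketch under simplifying assumptions. It omits the mixed effects D/S/A model and any covariates, and the function name and NO2 values are synthetic.

```python
from statistics import mean


def difference_in_differences(treated_pre, treated_post, control_pre, control_post):
    """Basic 2x2 estimator: (treated change) minus (control change)."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Synthetic annual NO2 values in ppb, for illustration only.
gmc_pre, gmc_post = [30.0, 32.0], [24.0, 26.0]  # near truck routes
far_pre, far_post = [20.0, 22.0], [16.0, 18.0]  # comparison areas
effect = difference_in_differences(gmc_pre, gmc_post, far_pre, far_post)
```

A negative value means the treated group saw the larger reduction.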
Affiliation(s)
- Jason G Su
- Environmental Health Sciences, School of Public Health, University of California, Berkeley, Berkeley, CA, USA.
| | - Ying-Ying Meng
- Center for Health Policy Research, Fielding School of Public Health, University of California, Los Angeles, Los Angeles, CA, USA
| | - Xiao Chen
- Center for Health Policy Research, Fielding School of Public Health, University of California, Los Angeles, Los Angeles, CA, USA
| | - John Molitor
- Biological and Population Health Sciences, College of Public Health and Human Sciences, Oregon State University, Corvallis, OR, USA
| | - Dahai Yue
- Center for Health Policy Research, Fielding School of Public Health, University of California, Los Angeles, Los Angeles, CA, USA
| | - Michael Jerrett
- Environmental Health Sciences, Fielding School of Public Health, University of California, Los Angeles, Los Angeles, CA, USA
|
29
|
A Comparative Analysis for Air Quality Estimation from Traffic and Meteorological Data. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10134587] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Air pollution in urban regions remains a crucial subject of study, given its implications on health and environment, where much effort is often put into monitoring pollutants and producing accurate trend estimates over time, employing expensive tools and sensors. In this work, we study the problem of air quality estimation in the urban area of Milan (IT), proposing different machine learning approaches that combine meteorological and transit-related features to produce affordable estimates without introducing sensor measurements into the computation. We investigated different configurations employing machine and deep learning models, namely a linear regressor, an Artificial Neural Network using Bayesian regularization, a Random Forest regressor and a Long Short Term Memory network. Our experiments show that affordable estimation results over the pollutants can be achieved even with simpler linear models, therefore suggesting that reasonably accurate Air Quality Index (AQI) measurements can be obtained without the need for expensive equipment.
|
30
|
Zhou Y, Chang LC, Chang FJ. Explore a Multivariate Bayesian Uncertainty Processor driven by artificial neural networks for probabilistic PM 2.5 forecasting. THE SCIENCE OF THE TOTAL ENVIRONMENT 2020; 711:134792. [PMID: 31812407 DOI: 10.1016/j.scitotenv.2019.134792] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/04/2019] [Revised: 09/10/2019] [Accepted: 10/01/2019] [Indexed: 06/10/2023]
Abstract
Quantifying predictive uncertainty inherent in the nonlinear multivariate dependence structure of multi-step-ahead PM2.5 forecasts is challenging. This study integrates a Multivariate Bayesian Uncertainty Processor (MBUP) and an artificial neural network (ANN) to make accurate probabilistic PM2.5 forecasts. The contributions of the proposed approach are two-fold. First, the MBUP can capture the nonlinear multivariate dependence structure between observed and forecasted data. Second, the MBUP can alleviate predictive uncertainty encountered in PM2.5 forecast models that are configured by ANNs. The reliability of the proposed approach was assessed by a case study on air quality in Taipei City of Taiwan. We consider forecasts of PM2.5 concentrations as a function of meteorological and air quality factors based on long-term (2010-2018) hourly observational datasets. Firstly, the Back Propagation Neural Network (BPNN) and the Adaptive Neural Fuzzy Inference System (ANFIS) were investigated to produce deterministic forecasts. Results revealed that the ANFIS model could learn different air pollutant emission mechanisms (i.e. primary, secondary and natural processes) from the clustering-based fuzzy inference system and produce more accurate deterministic forecasts than the BPNN. The ANFIS model then provided inputs (i.e. point estimates) to probabilistic forecast models. Next, two post-processing techniques (MBUP and the Univariate Bayesian Uncertainty Processor (UBUP)) were separately employed to produce probabilistic forecasts. The Bayesian Uncertainty Processors (BUPs) can model the dependence structure (i.e. posterior density function) between observed and forecasted data using a prior density function and a likelihood density function. Here in BUPs, the Monte Carlo simulation was introduced to create a probabilistic predictive interval of PM2.5 concentrations. 
The results demonstrated that the MBUP not only outperformed the UBUP but also suitably characterized the complex nonlinear multivariate dependence structure between observations and forecasts. Consequently, the proposed approach could reduce predictive uncertainty while significantly improving model reliability and PM2.5 forecast accuracy for future horizons.
Affiliation(s)
- Yanlai Zhou
- Department of Bioenvironmental Systems Engineering, National Taiwan University, Taipei 10617, Taiwan; Department of Geosciences, University of Oslo, P.O. Box 1047, Blindern, N-0316 Oslo, Norway
| | - Li-Chiu Chang
- Department of Water Resources and Environmental Engineering, Tamkang University, New Taipei City 25137, Taiwan
| | - Fi-John Chang
- Department of Bioenvironmental Systems Engineering, National Taiwan University, Taipei 10617, Taiwan.
|
31
|
Lee M, Lin L, Chen CY, Tsao Y, Yao TH, Fei MH, Fang SH. Forecasting Air Quality in Taiwan by Using Machine Learning. Sci Rep 2020; 10:4153. [PMID: 32139787 PMCID: PMC7057956 DOI: 10.1038/s41598-020-61151-7] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2019] [Accepted: 02/20/2020] [Indexed: 01/09/2023] Open
Abstract
This study proposes a gradient-boosting-based machine learning approach for predicting the PM2.5 concentration in Taiwan. The proposed mechanism is evaluated on a large-scale database built by the Environmental Protection Administration, and Central Weather Bureau, Taiwan, which includes data from 77 air monitoring stations and 580 weather stations performing hourly measurements over 1 year. By learning from past records of PM2.5 and neighboring weather stations’ climatic information, the forecasting model works well for 24-h prediction at most air stations. This study also investigates the geographical and meteorological divergence for the forecasting results of seven regional monitoring areas. We also compare the prediction performance between Taiwan, Taipei, and London; analyze the impact of industrial pollution; and propose an enhanced version of the prediction model to improve the prediction accuracy. The results indicate that Taipei and London have similar prediction results because these two cities have similar topography (basin) and are financial centers without domestic pollution sources. The results also suggest that after considering industrial impacts by incorporating additional features from the Taichung and Thong-Siau power plants, the proposed method achieves significant improvement in the coefficient of determination (R2) from 0.58 to 0.71. Moreover, for Taichung City the root-mean-square error decreases from 8.56 for the conventional approach to 7.06 for the proposed method.
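The coefficient of determination reported above compares a model's squared error with that of a constant forecast at the observed mean. A minimal sketch follows; the function name is an assumption and the inputs used here are synthetic.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the observations are constant")
    return 1.0 - ss_res / ss_tot


# A forecast that always returns the observed mean scores 0; a perfect one scores 1.
baseline = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```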
Affiliation(s)
- Mike Lee
- Far Eastern Group, Taipei, Taiwan
| | - Larry Lin
- Department of Electrical Engineering, Yuan Ze University, Taoyuan City, Taiwan; MOST Joint Research Center for AI Technology and All Vista Healthcare, Taipei, Taiwan
| | - Chih-Yuan Chen
- Department of Geography, Chinese Culture University, Taipei, Taiwan
| | - Yu Tsao
- Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan
| | - Ting-Hsuan Yao
- Department of Electrical Engineering, Yuan Ze University, Taoyuan City, Taiwan; MOST Joint Research Center for AI Technology and All Vista Healthcare, Taipei, Taiwan
| | - Min-Han Fei
- Department of Electrical Engineering, Yuan Ze University, Taoyuan City, Taiwan; MOST Joint Research Center for AI Technology and All Vista Healthcare, Taipei, Taiwan
| | - Shih-Hau Fang
- Department of Electrical Engineering, Yuan Ze University, Taoyuan City, Taiwan; MOST Joint Research Center for AI Technology and All Vista Healthcare, Taipei, Taiwan.
|
32
|
Rahman MM, Karunasinghe J, Clifford S, Knibbs LD, Morawska L. New insights into the spatial distribution of particle number concentrations by applying non-parametric land use regression modelling. THE SCIENCE OF THE TOTAL ENVIRONMENT 2020; 702:134708. [PMID: 31715399 DOI: 10.1016/j.scitotenv.2019.134708] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/11/2019] [Revised: 09/27/2019] [Accepted: 09/27/2019] [Indexed: 06/10/2023]
Abstract
Ambient particle number concentration (PNC) varies significantly in time and space within cities, yet complexity and cost prohibit large-scale routine monitoring; as a consequence, there is not enough data for assessment of human exposure to, or risk from the particles. The quality of assessments can be augmented by modelling; however, models are generally less capable of predicting PNC spatial variation than predicting variations in other ambient pollutants. To advance modelling of PNC, we aimed to develop and compare the performance of parametric and non-parametric machine learning land-use regression (LUR) models to predict hourly average PNC. We used data from 25 short-term stationary campaigns and five long-term sites during 2009-2012 in the Brisbane Metropolitan Area, Australia. We analysed three particle size ranges of total PNC (<30 nm, <414 nm and <3000 nm) as response variables, and over 150 independent variables, including land use, roads and traffic, population, distance, elevation, meteorology and time of day as potential predictors of PNC. The LUR models were developed separately for All Days, Nuc Days (when particle nucleation occurred), and No-nuc Days (when no particle nucleation occurred). We selected two algorithms to develop LUR models for PNC: a random forest (RF) model, and a generalised additive model (GAM) based on the least angle regression (LARS). The best LARS model for <30 nm, <414 nm and <3000 nm explained 30%, 31%, and 34%, respectively, whereas the best RF models were significantly better, explaining 73%, 64%, and 88%, respectively. Using this novel approach, we provided new insights into spatial variation in PNC and also demonstrated that the non-parametric RF model is a better choice for developing a LUR model for PNCs because of its robust predictive performance in comparison with the LARS parametric regression model.
Affiliation(s)
- Md Mahmudur Rahman
- International Laboratory for Air Quality and Health, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia; Climate and Atmospheric Science, Department of Planning, Industry and Environment, 480 Weeroona Road, Lidcombe, NSW 2141, Australia
| | - Jayanandana Karunasinghe
- International Laboratory for Air Quality and Health, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia
| | - Sam Clifford
- International Laboratory for Air Quality and Health, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia; Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, Keppel Street, London WC1E 7HT, UK
| | - Luke D Knibbs
- School of Public Health, The University of Queensland, Herston, QLD 4006, Australia
| | - Lidia Morawska
- International Laboratory for Air Quality and Health, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia.
|
33
|
Deep Random Subspace Learning: A Spatial-Temporal Modeling Approach for Air Quality Prediction. ATMOSPHERE 2019. [DOI: 10.3390/atmos10090560] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/18/2023]
Abstract
Decrease in air quality is one of the most crucial threats to human health. There is an imperative and necessary need for more accurate air quality prediction. To meet this need, we propose a novel long short-term memory-based deep random subspace learning (LSTM-DRSL) framework for air quality forecasting. Specifically, we incorporate real-time pollutant emission data into the model input. We also design a spatial-temporal analysis approach to make good use of these data. The prediction model is developed by combining random subspace learning with a deep learning algorithm in order to improve the prediction accuracy. Empirical analyses based on multiple datasets over China from January 2015 to September 2017 are performed to demonstrate the efficacy of the proposed framework for hourly pollutant concentration prediction at an urban-agglomeration scale. The empirical results indicate that our framework is a viable method for air quality prediction. With consideration of the regional scale, the LSTM-DRSL framework performs better at a relatively large regional scale (around 200–300 km). In addition, the quality of predictions is higher in industrial areas. From a temporal point of view, the LSTM-DRSL framework is more suitable for hourly predictions.
|
34
|
Yousefzadeh M, Farnaghi M, Pilesjö P, Mansourian A. Proposing and investigating PCAMARS as a novel model for NO 2 interpolation. ENVIRONMENTAL MONITORING AND ASSESSMENT 2019; 191:183. [PMID: 30798406 PMCID: PMC6394563 DOI: 10.1007/s10661-019-7253-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 11/13/2018] [Accepted: 01/21/2019] [Indexed: 06/09/2023]
Abstract
Effective measurement of exposure to air pollution, not least NO2, for epidemiological studies, along with the need for better management and control of air pollution in urban areas, calls for precise interpolation and determination of pollutant concentrations at unmonitored locations. A variety of approaches have been developed and used. This paper aims to propose, develop, and test a spatial predictive model based on multivariate adaptive regression splines (MARS) and principal component analysis (PCA) to determine the concentration of NO2 in Tehran, as a case study. To increase the accuracy of the model, spatial data (population, road network and points of interest such as petroleum stations and green spaces) and meteorological data (including temperature, pressure, wind speed and relative humidity) have also been used as independent variables, alongside air quality measurement data gathered by the monitoring stations. The outputs of the proposed model are evaluated against reference interpolation techniques including inverse distance weighting, thin plate splines, kriging, cokriging, and MARS3. Interpolation for 12 months showed better accuracies of the proposed model in comparison with the reference methods.
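Inverse distance weighting, one of the reference interpolation techniques listed above, is simple enough to sketch directly. The version below is a generic textbook form and not the paper's configuration; the function name, the planar coordinates, the power of 2, and the station values are assumptions for illustration.

```python
import math


def idw_interpolate(stations, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    `stations` is a list of ((x, y), value) pairs in a planar coordinate system.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # target coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical stations; the midpoint estimate is their average (20.0).
stations = [((0.0, 0.0), 10.0), ((2.0, 0.0), 30.0)]
midpoint_estimate = idw_interpolate(stations, (1.0, 0.0))
```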
Affiliation(s)
- Mohsen Yousefzadeh
- Faculty of Geodesy and Geomatics Engineering, K. N. Toosi University of Technology, Tehran, Iran
| | - Mahdi Farnaghi
- Faculty of Geodesy and Geomatics Engineering, K. N. Toosi University of Technology, Tehran, Iran
- GIS Center, Department of Physical Geography and Ecosystem Science, Lund University, 22362 Lund, Sweden
| | - Petter Pilesjö
- GIS Center, Department of Physical Geography and Ecosystem Science, Lund University, 22362 Lund, Sweden
- Center for Middle-Eastern Studies, Lund University, Lund, Sweden
| | - Ali Mansourian
- GIS Center, Department of Physical Geography and Ecosystem Science, Lund University, 22362 Lund, Sweden
- Center for Middle-Eastern Studies, Lund University, Lund, Sweden
|
35
|
Shang Z, Deng T, He J, Duan X. A novel model for hourly PM 2.5 concentration prediction based on CART and EELM. THE SCIENCE OF THE TOTAL ENVIRONMENT 2019; 651:3043-3052. [PMID: 30463154 DOI: 10.1016/j.scitotenv.2018.10.193] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/22/2018] [Revised: 10/09/2018] [Accepted: 10/14/2018] [Indexed: 06/09/2023]
Abstract
Hourly PM2.5 concentrations have multiple change patterns. For hourly PM2.5 concentration prediction, it is beneficial to split the whole dataset into several subsets with similar properties and to train a local prediction model for each subset. However, the methods based on local models need to solve the global-local duality. In this study, a novel prediction model based on classification and regression tree (CART) and ensemble extreme learning machine (EELM) methods is developed to split the dataset into subsets in a hierarchical fashion and build a prediction model for each leaf. Firstly, CART is used to split the dataset by constructing a shallow hierarchical regression tree. Then at each node of the tree, EELM models are built using the training samples of the node, and hidden neuron numbers are selected to minimize validation errors respectively on the leaves of a sub-tree that takes the node as the root. Finally, for each leaf of the tree, a global and several local EELMs on the path from the root to the leaf are compared, and the one with the smallest validation error on the leaf is chosen. The meteorological data of Yancheng urban area and the air pollutant concentration data from City Monitoring Centre are used to evaluate the method developed. The experimental results demonstrate that the method developed addresses the global-local duality, having better performance than global models including random forest (RF), v-support vector regression (v-SVR) and EELM, and other local models based on season and k-means clustering. The new model has improved the capability of treating multiple change patterns.
Affiliation(s)
- Zhigen Shang
- Department of Automation, Yancheng Institute of Technology, Yancheng 224051, China.
| | - Tong Deng
- The Wolfson Centre for Bulk Solids Handling Technology, Faculty of Engineering & Science, University of Greenwich, Kent ME4 4TB, UK
| | - Jianqiang He
- Department of Automation, Yancheng Institute of Technology, Yancheng 224051, China
| | - Xiaohui Duan
- Department of Automation, Yancheng Institute of Technology, Yancheng 224051, China
|
36
|
Retrieval of Daily PM2.5 Concentrations Using Nonlinear Methods: A Case Study of the Beijing–Tianjin–Hebei Region, China. REMOTE SENSING 2018. [DOI: 10.3390/rs10122006] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]
Abstract
Exposure to fine particulate matter (PM2.5) is associated with adverse health impacts on the population. Satellite observations and machine learning algorithms have been applied to improve the accuracy of the prediction of PM2.5 concentrations. In this study, we developed a PM2.5 retrieval approach using machine-learning methods, based on aerosol products from the Moderate Resolution Imaging Spectroradiometer (MODIS) aboard the NASA Earth Observation System (EOS) Terra and Aqua polar-orbiting satellites, near-ground meteorological variables from the NASA Goddard Earth Observing System (GEOS), and ground-based PM2.5 observation data. Four models, which are orthogonal regression (OR), regression tree (Rpart), random forests (RF), and support vector machine (SVM), were tested and compared in the Beijing–Tianjin–Hebei (BTH) region of China in 2015. Aerosol products derived from the Terra and Aqua satellite sensors were also compared. The 10-repeat 5-fold cross-validation (10 × 5 CV) method was subsequently used to evaluate the performance of the different aerosol products and the four models. The results show that the performance of the Aqua dataset was better than that of the Terra dataset, and that the RF algorithm has the best predictive performance (Terra: R = 0.77, RMSE = 43.51 μg/m3; Aqua: R = 0.85, RMSE = 33.90 μg/m3). This study shows promise for predicting the spatiotemporal distribution of PM2.5 using the RF model and Aqua aerosol product with the assistance of PM2.5 site data.
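The 10-repeat 5-fold cross-validation described above reshuffles the samples ten times and splits each shuffle into five folds, giving 50 train/test splits. A minimal index generator is sketched below; the function name, the seed handling, and the interleaved fold assignment are assumptions and may differ from the authors' tooling.

```python
import random


def repeated_kfold_indices(n_samples, n_splits=5, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # new random partition for every repeat
        for fold in range(n_splits):
            test_idx = indices[fold::n_splits]  # every n_splits-th shuffled sample
            test_set = set(test_idx)
            train_idx = [i for i in indices if i not in test_set]
            yield train_idx, test_idx


# 10 repeats x 5 folds = 50 splits over 23 hypothetical samples.
splits = list(repeated_kfold_indices(23, n_splits=5, n_repeats=10, seed=1))
```

Model scores would then be averaged over all 50 splits.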
|
37
|
Kang GK, Gao JZ, Chiao S, Lu S, Xie G. Air Quality Prediction: Big Data and Machine Learning Approaches. INTERNATIONAL JOURNAL OF ENVIRONMENTAL SCIENCE AND DEVELOPMENT 2018. [DOI: 10.18178/ijesd.2018.9.1.1066] [Citation(s) in RCA: 61] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
|
38
|
Aznarte JL. Probabilistic forecasting for extreme NO2 pollution episodes. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2017; 229:321-328. [PMID: 28605719 DOI: 10.1016/j.envpol.2017.05.079] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 11/08/2016] [Revised: 05/23/2017] [Accepted: 05/28/2017] [Indexed: 06/07/2023]
Abstract
In this study, we investigate the suitability of quantile regression for predicting extreme concentrations of NO2. Contrary to the usual point forecasting, where a single value is forecast for each horizon, probabilistic forecasting through quantile regression allows for the prediction of the full probability distribution, which in turn makes it possible to build models specifically fit for the tails of this distribution. Using data from the city of Madrid, including NO2 concentrations as well as meteorological measurements, we build models that predict extreme NO2 concentrations, outperforming point-forecasting alternatives, and we prove that the predictions are accurate, reliable and sharp. In addition, we study the relative importance of the independent variables involved, and show that the variables important for the median quantile differ from those important for the upper quantiles. Furthermore, we present a method to compute the probability of exceeding thresholds, which is a simple and comprehensible way to present probabilistic forecasts that maximizes their usefulness.
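To make the ideas in this abstract concrete, here is a minimal sketch of the pinball (quantile) loss that quantile regression minimizes, together with one simple way to turn a set of predicted quantiles into a threshold-exceedance probability. The abstract does not describe the author's actual method, so the linear interpolation of the CDF, the clamping outside the predicted range, and the names `pinball_loss` and `exceedance_probability` are assumptions for illustration only.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball loss of predictions y_pred for quantile level tau."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1) * diff
    return total / len(y_true)


def exceedance_probability(levels, quantiles, threshold):
    """Estimate P(Y > threshold) from predicted quantiles.

    Assumption: the CDF is linearly interpolated between the points
    (quantiles[i], levels[i]); both lists are sorted in ascending order.
    Outside the predicted range the estimate is clamped to the nearest
    known level, since the tails beyond it are not modelled here.
    """
    if threshold <= quantiles[0]:
        return 1.0 - levels[0]
    if threshold >= quantiles[-1]:
        return 1.0 - levels[-1]
    for i in range(len(quantiles) - 1):
        lo, hi = quantiles[i], quantiles[i + 1]
        if lo <= threshold <= hi:
            if hi == lo:
                cdf = levels[i + 1]
            else:
                frac = (threshold - lo) / (hi - lo)
                cdf = levels[i] + frac * (levels[i + 1] - levels[i])
            return 1.0 - cdf
```

For example, with hypothetical predicted quantiles of 20, 40 and 80 at levels 0.1, 0.5 and 0.9, a threshold of 60 lies halfway between the median and the 0.9 quantile, so the interpolated CDF is 0.7 and the exceedance estimate is about 0.3.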
Affiliation(s)
- José L Aznarte
- Artificial Intelligence Department, Universidad Nacional de Educación a Distancia - UNED, c/ Juan del Rosal, 16, Madrid, Spain.
|
39
|
Integrating Statistical Machine Learning in a Semantic Sensor Web for Proactive Monitoring and Control. SENSORS 2017; 17:807. [PMID: 28397776 PMCID: PMC5422168 DOI: 10.3390/s17040807] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 11/01/2016] [Revised: 01/11/2017] [Accepted: 01/24/2017] [Indexed: 11/16/2022]
Abstract
Proactive monitoring and control of our natural and built environments is important in various application scenarios. Semantic Sensor Web technologies have been well researched and used in environmental monitoring applications to expose sensor data for analysis, so that responsive actions can be taken in situations of interest. While these applications respond quickly to situations in order to minimize their unwanted effects, research efforts are still needed to provide techniques that anticipate the future and support proactive control, so that unwanted situations can be averted altogether. This study integrates a predictive model based on statistical machine learning into a Semantic Sensor Web using stream reasoning. The approach is evaluated in an indoor air quality monitoring case study. A sliding window approach that employs the Multilayer Perceptron model to predict short-term PM2.5 pollution situations is integrated into the proactive monitoring and control framework. Results show that the proposed approach can effectively predict short-term PM2.5 pollution situations: precision of up to 0.86 and sensitivity of up to 0.85 are achieved over half-hour prediction horizons, making it possible for the system to warn occupants or even to autonomously avert the predicted pollution situations within the context of the Semantic Sensor Web.
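The sliding window framing and the two reported metrics can be sketched as follows. This is an illustrative reading of the abstract rather than the study's implementation: the helper names `sliding_windows` and `precision_sensitivity`, the window and horizon parameters, and the binary encoding of a "pollution situation" as 1 are all assumptions.

```python
def sliding_windows(series, window, horizon=1):
    """Turn a time series into (features, target) pairs.

    Each sample uses `window` consecutive readings as features and the
    reading `horizon` steps after the window as the target. A classifier
    such as a multilayer perceptron could then be trained on these pairs.
    """
    samples = []
    for start in range(len(series) - window - horizon + 1):
        features = series[start:start + window]
        target = series[start + window + horizon - 1]
        samples.append((features, target))
    return samples


def precision_sensitivity(y_true, y_pred):
    """Precision and sensitivity (recall) for binary labels (1 = event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    return precision, sensitivity
```

Precision answers how many of the predicted pollution situations actually occurred, while sensitivity answers how many of the actual situations were predicted, which is why both are reported for a warning system.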
|