1
Li W, Zhao Y, Zhu Y, Dong Z, Wang F, Huang F. Research progress in water quality prediction based on deep learning technology: a review. Environmental Science and Pollution Research International 2024; 31:26415-26431. [PMID: 38538994 DOI: 10.1007/s11356-024-33058-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 03/20/2024] [Indexed: 05/04/2024]
Abstract
Water, an invaluable and non-renewable resource, plays an indispensable role in human survival and societal development. Accurate forecasting of water quality involves early identification of future pollutant concentrations and water quality indices, enabling evidence-based decision-making and targeted environmental interventions. The emergence of advanced computational technologies, particularly deep learning, has garnered considerable interest among researchers for applications in water quality prediction because of its robust data analytics capabilities. This article comprehensively reviews the deployment of deep learning methodologies in water quality forecasting, encompassing single-model and mixed-model approaches. Additionally, we delineate optimization strategies, data fusion techniques, and other factors influencing the efficacy of deep learning-based water quality prediction models, because understanding and mastering these factors are crucial for accurate water quality prediction. Although challenges such as data scarcity, long-term prediction accuracy, and limited deployments of large-scale models persist, future research aims to address these limitations by refining prediction algorithms, leveraging high-dimensional datasets, evaluating model performance, and broadening large-scale model application. These efforts contribute to precise water resource management and environmental conservation.
Affiliation(s)
- Wenhao Li
  - School of Electrical and Automation Engineering, Nanjing Normal University, Nanjing, China
  - Jiangsu Province Engineering Research Center of Environmental Risk Prevention and Emergency Response Technology, School of Environment, Nanjing, 210023, China
- Yin Zhao
  - School of Electrical and Automation Engineering, Nanjing Normal University, Nanjing, China
- Yining Zhu
  - Jiangsu Province Engineering Research Center of Environmental Risk Prevention and Emergency Response Technology, School of Environment, Nanjing, 210023, China
  - Key Laboratory for Soft Chemistry and Functional Materials of Ministry of Education, Nanjing University of Science and Technology, Nanjing, 210094, Jiangsu, China
- Zhongtian Dong
  - Key Laboratory for Soft Chemistry and Functional Materials of Ministry of Education, Nanjing University of Science and Technology, Nanjing, 210094, Jiangsu, China
- Fenghe Wang
  - Jiangsu Province Engineering Research Center of Environmental Risk Prevention and Emergency Response Technology, School of Environment, Nanjing, 210023, China
  - Key Laboratory for Soft Chemistry and Functional Materials of Ministry of Education, Nanjing University of Science and Technology, Nanjing, 210094, Jiangsu, China
- Fengliang Huang
  - School of Electrical and Automation Engineering, Nanjing Normal University, Nanjing, China
  - Jiangsu Province Engineering Research Center of Environmental Risk Prevention and Emergency Response Technology, School of Environment, Nanjing, 210023, China
2
Dhandapani A, Iqbal J, Kumar RN. Application of machine learning (individual vs stacking) models on MERRA-2 data to predict surface PM 2.5 concentrations over India. Chemosphere 2023; 340:139966. [PMID: 37634588 DOI: 10.1016/j.chemosphere.2023.139966] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/18/2023] [Revised: 07/31/2023] [Accepted: 08/24/2023] [Indexed: 08/29/2023]
Abstract
The spatial coverage of PM2.5 monitoring is non-uniform across India due to the limited number of ground monitoring stations. Alternatively, the Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2), an atmospheric reanalysis dataset, can be used for estimating PM2.5. MERRA-2 does not explicitly measure PM2.5 but rather follows an empirical model. MERRA-2 data were spatiotemporally collocated with ground observations for validation across India. Significant underestimation in MERRA-2 prediction of PM2.5 was observed over many monitoring stations, ranging from -20 to 60 μg m⁻³. The utility of Machine Learning (ML) models to overcome this challenge was assessed. MERRA-2 aerosol and meteorological parameters were the input features used to train and test the individual ML models and compare them with the stacking technique. Initially, with 10% of randomly selected data, individual model performance was assessed to identify the best model. XGBoost (XGB) was the best model (r² = 0.73) compared to Random Forest (RF) and LightGBM (LGBM). Stacking was then applied by keeping XGB as a meta-regressor. Stacked model results (r² = 0.77) outperformed the best standalone estimate of XGB. The stacking technique was used to predict hourly and daily PM2.5 in different regions across India and at each monitoring station. The eastern region exhibited the best hourly prediction (r² = 0.80) and a substantial reduction in Mean Bias (MB = -0.03 μg m⁻³), followed by the northern region (r² = 0.63 and MB = -0.10 μg m⁻³), which showed better output due to the frequent observation of PM2.5 >100 μg m⁻³. Due to sparse data availability to train the ML models, the lowest performance was for the central region (r² = 0.46 and MB = -0.60 μg m⁻³). Overall, the ML stacking technique predicted PM2.5 across India better on an hourly basis than on a daily basis.
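The two scores quoted in this abstract, r² and Mean Bias, can be computed as in the minimal sketch below. It assumes r² is the squared Pearson correlation and MB is the mean of predicted minus observed; the abstract does not state either definition. The function names and sample values are hypothetical and are not taken from the study.

```python
from statistics import mean


def mean_bias(predicted, observed):
    """Mean of (predicted - observed); a negative value indicates underestimation."""
    return mean(p - o for p, o in zip(predicted, observed))


def r_squared(predicted, observed):
    """Squared Pearson correlation between predictions and observations (assumed definition)."""
    mp, mo = mean(predicted), mean(observed)
    cov = sum((p - mp) * (o - mo) for p, o in zip(predicted, observed))
    var_p = sum((p - mp) ** 2 for p in predicted)
    var_o = sum((o - mo) ** 2 for o in observed)
    return cov * cov / (var_p * var_o)


# Hypothetical hourly PM2.5 values in μg m⁻³ (illustrative only, not study data)
observed = [40.0, 55.0, 120.0, 80.0, 65.0]
predicted = [38.0, 57.0, 110.0, 84.0, 61.0]

print("MB:", mean_bias(predicted, observed))
print("r2:", r_squared(predicted, observed))
```

Under this sign convention, an MB close to zero means the predictions are unbiased on average. It says nothing about scatter, which is why the abstract reports it alongside r².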
Affiliation(s)
- Abisheg Dhandapani
  - Department of Civil and Environmental Engineering, Birla Institute of Technology, Mesra, Ranchi, 835215, Jharkhand, India
- Jawed Iqbal
  - Department of Civil and Environmental Engineering, Birla Institute of Technology, Mesra, Ranchi, 835215, Jharkhand, India
- R Naresh Kumar
  - Department of Civil and Environmental Engineering, Birla Institute of Technology, Mesra, Ranchi, 835215, Jharkhand, India
3
Li W, Zhang M, Cai S, Wu L, Li C, He Y, Yang G, Wang J, Pan Y. Neural network-based prognostic predictive tool for gastric cardiac cancer: the worldwide retrospective study. BioData Min 2023; 16:21. [PMID: 37464415 DOI: 10.1186/s13040-023-00335-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Accepted: 07/03/2023] [Indexed: 07/20/2023] Open
Abstract
BACKGROUND: The incidence of gastric cardiac cancer (GCC) has increased markedly in recent years, and its prognosis is poor. It is necessary to compare the prognosis of GCC with that of carcinomas at other gastric sites and to set up an effective prognostic model based on a neural network to predict the survival of GCC patients. METHODS: In this population-based cohort study, we first collected clinical features from the Surveillance, Epidemiology and End Results (SEER) data (n = 31,397) as well as public Chinese data from different hospitals (n = 1049). The SEER data were then divided by time of diagnosis into two cohorts: the training cohort (patients diagnosed with GCC in 2010-2014, n = 4414) and the test cohort (diagnosed in 2015, n = 957). Age, sex, pathology, tumor, node, and metastasis (TNM) stage, tumor size, surgery or not, radiotherapy or not, chemotherapy or not, and history of malignancy were chosen as the predictive clinical features. The training cohort was used to build the neural network-based prognostic predictive model, which was validated on the training cohort itself and on the test cohort. Area under the receiver operating characteristic curve (AUC) was used to evaluate model performance. RESULTS: The prognosis of GCC patients in the SEER database was worse than that of non-GCC (NGCC) patients, while it was not worse in the Chinese data. A total of 5371 patients were used to build the model, following inclusion and exclusion criteria. The neural network-based prognostic predictive model had a satisfactory performance for GCC overall survival (OS) prediction, with an AUC of 0.7431 in the training cohort (95% confidence interval, CI, 0.7423-0.7439) and 0.7419 in the test cohort (95% CI, 0.7411-0.7428). CONCLUSIONS: GCC patients do have different survival times compared with non-GCC patients. The neural network-based prognostic predictive tool developed in this study is a novel and promising software tool for the clinical outcome analysis of GCC patients.
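The AUC used here as the performance measure can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison. It is an illustration of the metric only, not the study's model or code, and the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = event occurred) and model risk scores
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.3, 0.6, 0.7, 0.8, 0.2]

print("AUC:", roc_auc(labels, scores))
```

The pairwise form is quadratic in the number of cases, which is fine for a small illustration. A rank-based formulation gives the same value more efficiently for cohorts of several thousand patients.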
Affiliation(s)
- Wei Li
  - Cancer Research Center, Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Research Institute, No.9 Beiguan Street, Tongzhou District, Beijing, 101149, China
- Minghang Zhang
  - Cancer Research Center, Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Research Institute, No.9 Beiguan Street, Tongzhou District, Beijing, 101149, China
- Siyu Cai
  - Dermatology Department, General Hospital of Western Theater Command, No.270 Tianhui Road, Chengdu, 610083, Sichuan Province, China
- Liangliang Wu
  - Institute of Oncology, Senior Department of Oncology, the First Medical Center of Chinese PLA General Hospital, No.28 Fuxing Road, Haidian District, Beijing, 100853, China
- Chao Li
  - Department of Gastroenterology, Peking University Aerospace School of Clinical Medicine, No.15 Yuquan Road, Haidian District, Beijing, 100049, China
- Yuqi He
  - Department of Gastroenterology, Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Research Institute, No.9 Beiguan Street, Tongzhou District, Beijing, 101149, China
- Guibin Yang
  - Department of Gastroenterology, Peking University Aerospace School of Clinical Medicine, No.15 Yuquan Road, Haidian District, Beijing, 100049, China
- Jinghui Wang
  - Cancer Research Center, Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Research Institute, No.9 Beiguan Street, Tongzhou District, Beijing, 101149, China
- Yuanming Pan
  - Cancer Research Center, Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Research Institute, No.9 Beiguan Street, Tongzhou District, Beijing, 101149, China
4
Yang R, Liu H, Li Y. Quantifying uncertainty of marine water quality forecasts for environmental management using a dynamic multi-factor analysis and multi-resolution ensemble approach. Chemosphere 2023; 331:138831. [PMID: 37137396 DOI: 10.1016/j.chemosphere.2023.138831] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 04/25/2023] [Accepted: 04/30/2023] [Indexed: 05/05/2023]
Abstract
Unpredictable climate change and human activities pose enormous challenges to assessing the water quality components in the marine environment. Accurately quantifying the uncertainty of water quality forecasts can help decision-makers implement more scientific water pollution management strategies. This work introduces a new method of uncertainty quantification driven by point prediction for solving the engineering problem of water quality forecasting under the influence of complex environmental factors. The constructed multi-factor correlation analysis system can dynamically adjust the combined weight of environmental indicators according to the performance, thereby increasing the interpretability of data fusion. The designed singular spectrum analysis is utilized to reduce the volatility of the original water quality data. The real-time decomposition technique cleverly avoids the problem of data leakage. The multi-resolution-multi-objective optimization ensemble method is adopted to absorb the characteristics of different resolution data, so as to mine deeper potential information. Experimental studies are conducted using 6 actual water quality high-resolution signals with 21,600 sampling points from the Pacific islands and corresponding low-resolution signals with 900 sampling points, including temperature, salinity, turbidity, chlorophyll, dissolved oxygen, and oxygen saturation. The results illustrate that the model is superior to the existing model in quantifying the uncertainty of water quality prediction.
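The data-leakage point can be illustrated with a decomposition that only ever looks backwards in time. The sketch below uses a trailing moving average as a simple stand-in for singular spectrum analysis. It is not the authors' method, and the function names, window length and sample values are assumptions made for illustration.

```python
def causal_trend(series, window):
    """Trailing moving average: the value at index t uses only samples up to and including t."""
    trend = []
    for t in range(len(series)):
        past = series[max(0, t - window + 1): t + 1]
        trend.append(sum(past) / len(past))
    return trend


def decompose(series, window=3):
    """Split a series into a smooth trend and a residual without using future samples."""
    trend = causal_trend(series, window)
    residual = [x - m for x, m in zip(series, trend)]
    return trend, residual


# Hypothetical water quality readings (illustrative only)
readings = [1.0, 2.0, 3.0, 4.0, 5.0]
trend, residual = decompose(readings, window=3)
print(trend)
print(residual)
```

Because each trend value depends only on past and present samples, changing later readings cannot alter components that were already computed. That is the property a decomposition needs if its output is fed into a forecasting model.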
Affiliation(s)
- Rui Yang
  - Institute of Artificial Intelligence and Robotics (IAIR), Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic and Transportation Engineering, Central South University, Changsha, 410075, Hunan, China
- Hui Liu
  - Institute of Artificial Intelligence and Robotics (IAIR), Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic and Transportation Engineering, Central South University, Changsha, 410075, Hunan, China
- Yanfei Li
  - School of Mechatronic Engineering, Hunan Agricultural University, Changsha, 410128, Hunan, China