1
Lloyd SD, Carvajal G, Campey M, Taylor N, Osmond P, Roser DJ, Khan SJ. Predicting recreational water quality and public health safety in urban estuaries using Bayesian Networks. WATER RESEARCH 2024; 254:121319. [PMID: 38422692 DOI: 10.1016/j.watres.2024.121319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 02/05/2024] [Accepted: 02/14/2024] [Indexed: 03/02/2024]
Abstract
To support the reactivation of urban rivers and estuaries for bathing while ensuring public safety, it is critical to have access to real-time information on microbial water quality and associated health risks. Predictive modelling can provide this information, though challenges concerning the optimal size of training data, model transferability, and communication of uncertainty still need attention. Further, urban estuaries undergo distinctive hydrological variations requiring tailored modelling approaches. This study assessed the use of Bayesian Networks (BNs) for the prediction of enterococci exceedances and extrapolation of health risks at planned bathing sites in an urban estuary in Sydney, Australia. The transferability of network structures between sites was assessed. Models were validated using a novel application of the k-fold walk-forward validation procedure and further tested using independent compliance and event-based sampling datasets. Learning curves indicated the model's sensitivity reached a minimum performance threshold of 0.8 once training data included ≥ 400 observations. It was demonstrated that Semi-Naïve BN structures can be transferred while maintaining stable predictive performance. In all sites, salinity and solar exposure had the greatest influence on Posterior Probability Distributions (PPDs), when combined with antecedent rainfall. The BNs provided a novel and transparent framework to quantify and visualise enterococci, stormwater impact, health risks, and associated uncertainty under varying environmental conditions. This study has advanced the application of BNs in predicting recreational water quality and providing decision support in urban estuarine settings, proposed for bathing, where uncertainty is high.
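The walk-forward idea mentioned in the abstract can be illustrated with a generic expanding-window splitter. This is a minimal sketch, not the authors' procedure: the function name `walk_forward_splits`, the equal block sizes, and the expanding (rather than sliding) training window are all assumptions made for illustration.

```python
def walk_forward_splits(n_samples, n_folds):
    """Yield (train_indices, test_indices) for expanding-window validation.

    Samples are assumed to be sorted by time. The series is cut into
    n_folds + 1 contiguous blocks; fold i trains on blocks 0..i-1 and
    tests on block i, so no test sample precedes its training data.
    """
    if n_folds < 1 or n_samples < n_folds + 1:
        raise ValueError("need at least n_folds + 1 samples")
    block = n_samples // (n_folds + 1)
    for i in range(1, n_folds + 1):
        train_end = i * block
        # The last fold absorbs any remainder samples.
        test_end = n_samples if i == n_folds else train_end + block
        yield list(range(train_end)), list(range(train_end, test_end))


splits = list(walk_forward_splits(n_samples=10, n_folds=4))
for train, test in splits:
    print(len(train), test)
```

Unlike shuffled k-fold cross-validation, every test block here lies strictly after its training data, which is the property that matters when the model is meant to predict future water quality.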
Affiliation(s)
- Simon D Lloyd
  - School of Built Environment, University of New South Wales, NSW, Australia
- Guido Carvajal
  - Facultad de Ingeniería, Universidad Andrés Bello, Antonio Varas 880, Providencia, Santiago, Chile
- Meredith Campey
  - Beachwatch, NSW Department of Planning and Environment, NSW, Australia
- Paul Osmond
  - School of Built Environment, University of New South Wales, NSW, Australia
- David J Roser
  - School of Civil and Environmental Engineering, University of New South Wales, NSW, Australia
- Stuart J Khan
  - School of Civil Engineering, University of Sydney, NSW, Australia
2
Seis W, Veldhuis MCT, Rouault P, Steffelbauer D, Medema G. A new Bayesian approach for managing bathing water quality at river bathing locations vulnerable to short-term pollution. WATER RESEARCH 2024; 252:121186. [PMID: 38340453 DOI: 10.1016/j.watres.2024.121186] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 12/21/2023] [Accepted: 01/23/2024] [Indexed: 02/12/2024]
Abstract
Short-term fecal pollution events are a major challenge for managing microbial safety at recreational waters. Long turn-over times of current laboratory methods for analyzing fecal indicator bacteria (FIB) delay water quality assessments. Data-driven models have been shown to be valuable approaches to enable fast water quality assessments. However, a major barrier towards the wider use of such models is the prevalent data scarcity at existing bathing waters, which questions the representativeness and thus usefulness of such datasets for model training. The present study explores the ability of five data-driven modelling approaches to predict short-term fecal pollution episodes at recreational bathing locations under data scarce situations and imbalanced datasets. The study explicitly focuses on the potential benefits of adopting an innovative modeling and risk-based assessment approach, based on state/cluster-based Bayesian updating of FIB distributions in relation to different hydrological states. The models are benchmarked against commonly applied supervised learning approaches, particularly linear regression, and random forests, as well as to a zero-model which closely resembles the current way of classifying bathing water quality in the European Union. For model-based clustering we apply a non-parametric Bayesian approach based on a Dirichlet Process Mixture Model. The study tests and demonstrates the proposed approaches at three river bathing locations in Germany, known to be influenced by short-term pollution events. At each river two modelling experiments ("longest dry period", "sequential model training") are performed to explore how the different modelling approaches react and adapt to scarce and uninformative training data, i.e., datasets that do not include event pollution information in terms of elevated FIB concentrations. 
We demonstrate that it is especially the proposed Bayesian approaches that are able to raise correct warnings in such situations (> 90% true positive rate). The zero-model and random forest are shown to be unable to predict contamination episodes if pollution episodes are not present in the training data. Our research shows that the investigated Bayesian approaches reduce the risk of missed pollution events, thereby improving bathing water safety management. Additionally, the approaches provide a transparent solution for setting minimum data quality requirements under various conditions. The proposed approaches open the way for developing data-driven models for bathing water quality prediction against the reality that data scarcity is a common problem at existing and prospective bathing waters.
Affiliation(s)
- Wolfgang Seis
  - KWB Kompetenzzentrum Wasser Berlin gGmbH, Cicerostraße 24, Berlin 10709, Germany; Water Management Department, Faculty of Civil Engineering and Geosciences, Delft University of Technology, Stevinweg 1, Delft 2628 CN, the Netherlands
- Marie-Claire Ten Veldhuis
  - Water Management Department, Faculty of Civil Engineering and Geosciences, Delft University of Technology, Stevinweg 1, Delft 2628 CN, the Netherlands
- Pascale Rouault
  - KWB Kompetenzzentrum Wasser Berlin gGmbH, Cicerostraße 24, Berlin 10709, Germany
- David Steffelbauer
  - KWB Kompetenzzentrum Wasser Berlin gGmbH, Cicerostraße 24, Berlin 10709, Germany
- Gertjan Medema
  - Water Management Department, Faculty of Civil Engineering and Geosciences, Delft University of Technology, Stevinweg 1, Delft 2628 CN, the Netherlands; KWR Water Research Institute, Groningenhaven 7, Nieuwegein 3433PE, the Netherlands
3
Anderson CE, Boehm AB. Sunlight Inactivation of Enveloped Viruses in Clear Water. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:21395-21404. [PMID: 38062652 DOI: 10.1021/acs.est.3c06680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/20/2023]
Abstract
Enveloped virus fate in the environment is not well understood; there are no quantitative data on sunlight inactivation of enveloped viruses in water. Herein, we measured the sunlight inactivation of two enveloped viruses (Phi6 and murine hepatitis virus, MHV) and a nonenveloped virus (MS2) over time in clear water with simulated sunlight exposure. We attenuated UV sunlight wavelengths using long-pass 50% cutoff filters at 280, 305, and 320 nm. With the lowest UV attenuation tested, all decay rate constants (corrected for UV light screening, k̂) were significantly different from dark controls; the MS2 k̂ was equal to 4.5 m²/MJ, compared to 16 m²/MJ for Phi6 and 52 m²/MJ for MHV. With the highest UV attenuation tested, only k̂ for MHV (6.1 m²/MJ) was different from the dark control. Results indicate that the two enveloped viruses decay faster than the nonenveloped virus studied, and k̂ are significantly impacted by UV attenuation. Differences in k̂ may be due to the presence of viral envelopes but may also be related to other differing intrinsic properties of the viruses, including genome length and composition. Reported k̂ values can inform strategies to reduce the risk from exposure to enveloped viruses in the environment.
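To give a feel for what the reported rate constants imply, the sketch below converts each k̂ into the fluence needed for a 1-log10 reduction. It assumes first-order decay in fluence, ln(C/C0) = -k̂·F, which is an assumption consistent with the m²/MJ units but not stated in the abstract; the function names are hypothetical.

```python
import math

# Decay rate constants k-hat (m^2/MJ) reported in the abstract for the
# lowest UV attenuation tested.
K_HAT = {"MS2": 4.5, "Phi6": 16.0, "MHV": 52.0}


def surviving_fraction(k_hat, fluence):
    """Fraction of infective virus left after `fluence` (MJ/m^2).

    Assumes first-order decay in fluence: ln(C/C0) = -k_hat * fluence.
    """
    return math.exp(-k_hat * fluence)


def fluence_for_log10_reduction(k_hat, logs=1.0):
    """Fluence (MJ/m^2) for a `logs`-log10 reduction, same assumption."""
    return logs * math.log(10) / k_hat


for virus, k in K_HAT.items():
    print(virus, round(fluence_for_log10_reduction(k), 3))
```

Under this assumption a larger k̂ means less sunlight is needed for the same reduction, which matches the abstract's ordering of MHV, Phi6, and MS2.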
Affiliation(s)
- Claire E Anderson
  - Department of Civil and Environmental Engineering, Stanford University, Stanford, California 94305, United States
- Alexandria B Boehm
  - Department of Civil and Environmental Engineering, Stanford University, Stanford, California 94305, United States
4
Searcy RT, Boehm AB. Know Before You Go: Data-Driven Beach Water Quality Forecasting. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:17930-17939. [PMID: 36472482 DOI: 10.1021/acs.est.2c05972] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/17/2023]
Abstract
Forecasting environmental hazards is critical in preventing or building resilience to their impacts on human communities and ecosystems. Environmental data science is an emerging field that can be harnessed for forecasting, yet more work is needed to develop methodologies that can leverage increasingly large and complex data sets for decision support. Here, we design a data-driven framework that can, for the first time, forecast bacterial standard exceedances at marine beaches with 3 days lead time. Using historical data sets collected at two California sites, we train nearly 400 forecast models using statistical and machine learning techniques and test forecasts against predictions from both a naive "persistence" model and a baseline nowcast model. Overall, forecast models are found to have similar sensitivities and specificities to the persistence model, but significantly higher areas under the ROC curve (a metric distinguishing a model's ability to effectively parse classes across decision thresholds), suggesting that forecasts can provide enhanced information beyond past observations alone. Forecast model performance at all lead times was similar to that of nowcast models. Together, results suggest that integrating the forecasting framework developed in this study into beach management programs can enable better public notification and aid in proactive pollution and health risk management.
Affiliation(s)
- Ryan T Searcy
  - Department of Civil & Environmental Engineering, Stanford University, 473 Via Ortega, Stanford, California 94305, United States
- Alexandria B Boehm
  - Department of Civil & Environmental Engineering, Stanford University, 473 Via Ortega, Stanford, California 94305, United States
5
Zimmer-Faust AG, Griffith JF, Steele JA, Santos B, Cao Y, Asato L, Chiem T, Choi S, Diaz A, Guzman J, Laak D, Padilla M, Quach-Cu J, Ruiz V, Woo M, Weisberg SB. Relationship between coliphage and Enterococcus at southern California beaches and implications for beach water quality management. WATER RESEARCH 2023; 230:119383. [PMID: 36630853 DOI: 10.1016/j.watres.2022.119383] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Revised: 11/08/2022] [Accepted: 11/17/2022] [Indexed: 06/17/2023]
Abstract
Coliphage have been suggested as an alternative to fecal indicator bacteria for assessing recreational beach water quality, but it is unclear how frequently and at what types of beaches coliphage produces a different management outcome. Here we conducted side-by-side sampling of male-specific and somatic coliphage by the new EPA dead-end hollow fiber ultrafiltration (D-HFUF-SAL) method and Enterococcus at southern California beaches over two years. When samples were combined for all beach sites, somatic and male-specific coliphage both correlated with Enterococcus. When examined categorically, Enterococcus would have resulted in approximately two times the number of health advisories as somatic coliphage and four times that of male-specific coliphage, using recently proposed thresholds of 60 PFU/100 mL for somatic and 30 PFU/100 mL for male-specific coliphage. Overall, only 12% of total exceedances would have been for coliphage alone. Somatic coliphage exceedances that occurred in the absence of an Enterococcus exceedance were limited to a single site during south swell events, when this beach is known to be affected by nearby minimally treated sewage. Thus, somatic coliphage provided additional valuable health protection information, but may be more appropriate as a supplement to FIB measurements rather than as a replacement because: (a) EPA-approved PCR methods for Enterococcus allow a more rapid response, (b) coliphage is more challenging owing to its greater sampling volume and laboratory time requirements, and (c) Enterococcus' long data history has yielded predictive management models that would need to be recreated for coliphage.
Affiliation(s)
- Amity G Zimmer-Faust
  - Southern California Coastal Water Research Project Authority, 3535 Harbor Blvd., Costa Mesa, CA 92626, United States
- John F Griffith
  - Southern California Coastal Water Research Project Authority, 3535 Harbor Blvd., Costa Mesa, CA 92626, United States
- Joshua A Steele
  - Southern California Coastal Water Research Project Authority, 3535 Harbor Blvd., Costa Mesa, CA 92626, United States
- Bryan Santos
  - City of San Diego, Environmental Monitoring and Technical Services, United States
- Yiping Cao
  - Orange County Sanitation District, United States
- Laralyn Asato
  - City of San Diego, Environmental Monitoring and Technical Services, United States
- Tania Chiem
  - Orange County Public Health Laboratory, United States
- Samuel Choi
  - Orange County Sanitation District, United States
- Arturo Diaz
  - Orange County Sanitation District, United States
- Joe Guzman
  - Orange County Public Health Laboratory, United States
- David Laak
  - Ventura County Public Works Agency, United States
- Victor Ruiz
  - Los Angeles City Sanitation Department, United States
- Mary Woo
  - California State University Channel Islands, Ventura, CA, United States
- Stephen B Weisberg
  - Southern California Coastal Water Research Project Authority, 3535 Harbor Blvd., Costa Mesa, CA 92626, United States
6
Lučin I, Družeta S, Mauša G, Alvir M, Grbčić L, Lušić DV, Sikirica A, Kranjčević L. Predictive modeling of microbiological seawater quality in karst region using cascade model. THE SCIENCE OF THE TOTAL ENVIRONMENT 2022; 851:158009. [PMID: 35987218 DOI: 10.1016/j.scitotenv.2022.158009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Revised: 08/06/2022] [Accepted: 08/09/2022] [Indexed: 06/15/2023]
Abstract
This paper presents an in-depth analysis of seawater quality measurements during the bathing seasons from year 2009 to 2020 in the city of Rijeka, Croatia. Due to rare occurrences of measurements with less than excellent water quality, the considered dataset is deeply imbalanced. Additionally, it incorporates measurements under the influence of submerged groundwater discharges (SGD), which were observed in some bathing locations. These discharges were previously thought to dry up during the summer season and are now suspected to be one of the causes of increased Escherichia coli values. Consequently, and in view of the fact that the accuracy of prediction models can be significantly influenced by temporal and spatial variation of the input data, a novel cascade prediction modeling strategy was proposed. It consists of a sequence of prediction models which tend to identify general environmental conditions which confidently lead to excellent bathing water quality. The proposed model uses environmental features which can rather easily be estimated or obtained from the weather forecast. The model was trained on a highly biased dataset, consisting of data from locations with and without SGD influence, and for the time period spanning extremely dry and warm seasons, extremely wet seasons, as well as normal seasons. To simulate realistic application, the model was tested using temporal and spatial stratification of data. The cascade strategy was shown to be a good approach for reliably detecting environmental parameters which produce excellent water quality. The proposed model is designed as a filter method, where instances classified as less-than-excellent water quality require further analysis. The cascade model provides great flexibility as it can be customized to the particular needs of the investigated area and dataset specifics.
Affiliation(s)
- Ivana Lučin
  - Department of Fluid Mechanics and Computational Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka 51000, Croatia; Center for Advanced Computing and Modelling, University of Rijeka, Radmile Matejčić 2, Rijeka 51000, Croatia
- Siniša Družeta
  - Department of Fluid Mechanics and Computational Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka 51000, Croatia; Center for Advanced Computing and Modelling, University of Rijeka, Radmile Matejčić 2, Rijeka 51000, Croatia
- Goran Mauša
  - Department of Computer Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka 51000, Croatia; Center for Advanced Computing and Modelling, University of Rijeka, Radmile Matejčić 2, Rijeka 51000, Croatia
- Marta Alvir
  - Department of Fluid Mechanics and Computational Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka 51000, Croatia
- Luka Grbčić
  - Department of Fluid Mechanics and Computational Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka 51000, Croatia; Center for Advanced Computing and Modelling, University of Rijeka, Radmile Matejčić 2, Rijeka 51000, Croatia
- Darija Vukić Lušić
  - Center for Advanced Computing and Modelling, University of Rijeka, Radmile Matejčić 2, Rijeka 51000, Croatia; Department of Environmental Health, Faculty of Medicine, University of Rijeka, Braće Branchetta 20/1, Rijeka 51000, Croatia; Department of Environmental Health, Teaching Institute of Public Health of Primorje-Gorski Kotar County, Krešimirova 52a, Rijeka 51000, Croatia
- Ante Sikirica
  - Department of Fluid Mechanics and Computational Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka 51000, Croatia; Center for Advanced Computing and Modelling, University of Rijeka, Radmile Matejčić 2, Rijeka 51000, Croatia
- Lado Kranjčević
  - Department of Fluid Mechanics and Computational Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka 51000, Croatia; Center for Advanced Computing and Modelling, University of Rijeka, Radmile Matejčić 2, Rijeka 51000, Croatia
7
Li L, Qiao J, Yu G, Wang L, Li HY, Liao C, Zhu Z. Interpretable tree-based ensemble model for predicting beach water quality. WATER RESEARCH 2022; 211:118078. [PMID: 35066260 DOI: 10.1016/j.watres.2022.118078] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 08/14/2021] [Revised: 11/29/2021] [Accepted: 01/12/2022] [Indexed: 06/14/2023]
Abstract
Tree-based machine learning models based on environmental features offer low-cost and timely solutions for predicting microbial fecal contamination in beach water to inform the public of the health risk. However, many of these models are black boxes that are difficult for humans to understand, which may cause severe consequences such as unexplained decisions and failure in accountability. To develop interpretable predictive models for beach water quality, we evaluate five tree-based models, namely classification tree, random forest, CatBoost, XGBoost, and LightGBM, and employ a state-of-the-art explanation method SHAP to explain the models. When tested on the Escherichia coli (E. coli) concentration data collected from three beach sites along Lake Erie shores, LightGBM, followed by XGBoost, achieves the highest averaged precision and recall scores. For all three sites, both models suggest lake turbidity as the most important predictor, and elucidate the crucial role of accurate local data of wave height and rainfall in the model development. Local SHAP values further reveal the robustness of the importance of lake turbidity as its SHAP value increases nearly monotonically with its value and is minimally affected by other environmental factors. Moreover, we found an intriguing interaction between lake turbidity and day-of-year. This work suggests that the combination of LightGBM and SHAP has a promising potential to develop interpretable models for predicting microbial water quality in freshwater lakes.
Affiliation(s)
- Lingbo Li
  - Department of Civil, Structural and Environmental Engineering, University at Buffalo, The State University of New York, Buffalo, NY, USA
- Jundong Qiao
  - Department of Civil, Structural and Environmental Engineering, University at Buffalo, The State University of New York, Buffalo, NY, USA
- Guan Yu
  - Department of Biostatistics, University at Buffalo, The State University of New York, Buffalo, NY, USA
- Leizhi Wang
  - Nanjing Hydraulic Research Institute, State Key Laboratory of Hydrology, Water Resources and Hydraulic Engineering & Science, Nanjing 210029, China
- Hong-Yi Li
  - Department of Civil and Environmental Engineering, University of Houston, Houston, TX, USA
- Chen Liao
  - Program for Computational and Systems Biology, Memorial Sloan-Kettering Cancer Center, NY, USA
- Zhenduo Zhu
  - Department of Civil, Structural and Environmental Engineering, University at Buffalo, The State University of New York, Buffalo, NY, USA
8
Sokolova E, Ivarsson O, Lillieström A, Speicher NK, Rydberg H, Bondelind M. Data-driven models for predicting microbial water quality in the drinking water source using E. coli monitoring and hydrometeorological data. THE SCIENCE OF THE TOTAL ENVIRONMENT 2022; 802:149798. [PMID: 34454142 DOI: 10.1016/j.scitotenv.2021.149798] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/08/2021] [Revised: 07/08/2021] [Accepted: 08/16/2021] [Indexed: 06/13/2023]
Abstract
Rapid changes in microbial water quality in surface waters pose challenges for production of safe drinking water. If not treated to an acceptable level, microbial pathogens present in the drinking water can result in severe consequences for public health. The aim of this paper was to evaluate the suitability of data-driven models of different complexity for predicting the concentrations of E. coli in the river Göta älv at the water intake of the drinking water treatment plant in Gothenburg, Sweden. The objectives were to (i) assess how the complexity of the model affects the model performance; and (ii) identify relevant factors and assess their effect as predictors of E. coli levels. To forecast E. coli levels one day ahead, the data on laboratory measurements of E. coli and total coliforms, Colifast measurements of E. coli, water temperature, turbidity, precipitation, and water flow were used. The baseline approaches included Exponential Smoothing and ARIMA (Autoregressive Integrated Moving Average), which are commonly used univariate methods, and a naive baseline that used the previous observed value as its next prediction. Also, models common in the machine learning domain were included: LASSO (Least Absolute Shrinkage and Selection Operator) Regression and Random Forest, and a tool for optimising machine learning pipelines - TPOT (Tree-based Pipeline Optimization Tool). Also, a multivariate autoregressive model VAR (Vector Autoregression) was included. The models that included multiple predictors performed better than univariate models. Random Forest and TPOT resulted in higher performance but showed a tendency of overfitting. Water temperature, microbial concentrations upstream and at the water intake, and precipitation upstream were shown to be important predictors. 
Data-driven modelling enables water producers to interpret the measurements in the context of what concentrations can be expected based on the recent historic data, and thus identify unexplained deviations warranting further investigation of their origin.
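The naive baseline described in the abstract, which reuses the previous observed value as the next prediction, is simple enough to sketch directly. The data values and the choice of mean absolute error as the score are assumptions for illustration; they are not taken from the paper.

```python
def naive_forecast(series):
    """One-step-ahead forecasts: each prediction is the previous observation."""
    return series[:-1]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired observations and forecasts."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical daily log10 E. coli concentrations, for illustration only.
log_ecoli = [2.1, 2.3, 2.2, 3.0, 2.8, 2.4]

predicted = naive_forecast(log_ecoli)  # forecasts for days 2..6
actual = log_ecoli[1:]                 # observations for days 2..6
baseline_mae = mean_absolute_error(actual, predicted)
print(round(baseline_mae, 2))
```

Any more complex model would need to beat this score on the same days to justify its extra inputs.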
Affiliation(s)
- Ekaterina Sokolova
  - Chalmers University of Technology, Department of Architecture and Civil Engineering, Sweden
- Oscar Ivarsson
  - Chalmers University of Technology, Department of Computer Science and Engineering, Sweden
- Ann Lillieström
  - Chalmers University of Technology, Department of Computer Science and Engineering, Sweden
- Nora K Speicher
  - Chalmers University of Technology, Department of Computer Science and Engineering, Sweden
- Henrik Rydberg
  - City of Gothenburg, Department of Sustainable Water and Waste, Sweden
- Mia Bondelind
  - Chalmers University of Technology, Department of Architecture and Civil Engineering, Sweden
9
Guo J, Lee JHW. Development of Predictive Models for "Very Poor" Beach Water Quality Gradings Using Class-Imbalance Learning. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2021; 55:14990-15000. [PMID: 34634206 DOI: 10.1021/acs.est.1c03350] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
Statistical water quality forecast models are useful tools to assist with beach management. In particular, multiple linear regression (MLR) models have been successfully developed for prediction of fecal indicator bacteria concentrations for beaches in river, lake, and marine environments. Nevertheless, an unresolved challenging issue is the reliable prediction of infrequent events of high bacterial concentrations to inform beach closure decisions to protect public health. The number of field data available for the infrequent events is typically an order of magnitude less than that for days when the water quality criterion is met; MLR models often perform poorly in predicting bacterial concentrations on days when the beaches should be closed. For beach management in Hong Kong, MLR models have been developed to predict beach water quality indices in terms of four gradings (BWQI-1 to 4) based on Escherichia coli (E. coli) concentrations. In this study, we propose an artificial intelligence (AI)-based binary classification (EasyEnsemble) model using class-imbalance learning to predict "very poor" occasions (BWQI-4), when E. coli concentration exceeds 610 counts/100 mL. Models are developed for three marine beaches with different hydrographic and pollution characteristics using a 30-year data set spanning three periods with different water quality status. The model-data comparison over a wide range of conditions shows that the proposed method results in a significant improvement in the prediction of "very poor" water quality. The proposed class-imbalance method for predicting rare events has an F-score of 0.84, and it significantly outperforms MLR and classification tree (CT) models with corresponding F-scores of 0.39 and 0.69. A robust beach water quality forecast system can hence be developed using hybrid MLR-binary classification modeling.
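EasyEnsemble-style methods, as generally described, train one classifier on each of several balanced subsets and then combine their outputs. The sketch below covers only the subset construction step and is not the authors' implementation; the function name `balanced_subsets`, the label encoding, and the sample sizes are hypothetical.

```python
import random


def balanced_subsets(labels, n_subsets, seed=0):
    """Return index lists that pair all minority samples with an
    equally sized random draw from the majority class.

    `labels` holds 1 for the rare class (e.g. "very poor") and 0 otherwise.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    subsets = []
    for _ in range(n_subsets):
        # Sample without replacement within one subset; different
        # subsets may reuse the same majority samples.
        drawn = rng.sample(majority, len(minority))
        subsets.append(sorted(minority + drawn))
    return subsets


# Hypothetical labels: 3 rare events among 30 samples.
labels = [1 if i in (4, 17, 25) else 0 for i in range(30)]
subsets = balanced_subsets(labels, n_subsets=5)
```

Each subset keeps every rare event, so no minority information is discarded, while the repeated draws let the ensemble see much of the majority class overall.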
Affiliation(s)
- Jiuhao Guo
  - Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Joseph H W Lee
  - Macao Environmental Research Institute, Macau University of Science and Technology, Taipa, Macao, China
  - Institute for Advanced Study, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
10
Bourel M, Segura AM, Crisci C, López G, Sampognaro L, Vidal V, Kruk C, Piccini C, Perera G. Machine learning methods for imbalanced data set for prediction of faecal contamination in beach waters. WATER RESEARCH 2021; 202:117450. [PMID: 34352535 DOI: 10.1016/j.watres.2021.117450] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/11/2020] [Revised: 07/09/2021] [Accepted: 07/15/2021] [Indexed: 06/13/2023]
Abstract
Predicting water contamination by statistical models is a useful tool to manage health risk in recreational beaches. Extreme contamination events, i.e. those exceeding regulatory limits, are generally rare with respect to bathing conditions and thus the data is said to be imbalanced. Modeling and predicting those rare events present unique challenges. Here we introduce and evaluate several machine learning techniques and metrics to model imbalanced data and evaluate model performance. We do so by using a) simulated data sets and b) a real database with records of faecal coliform abundance monitored for 10 years in 21 recreational beaches in Uruguay (N ≈ 19000) using in situ and meteorological variables. We discuss advantages and disadvantages of the methods and provide a simple guide to building models for a general audience. We also provide R code to reproduce model fitting and testing. We found that most machine learning techniques are sensitive to imbalance and require specific data pre-treatment (e.g. upsampling) to improve performance. Accuracy (i.e. correctly classified cases over total cases) is not adequate to evaluate model performance on imbalanced data sets. Instead, true positive rates (TPR) and false positive rates (FPR) are recommended. Among the 52 possible candidate algorithms tested, the stratified random forest performed best, improving TPR by 50% with respect to the baseline (0.4), and outperformed the baseline in the evaluated metrics. Support vector machines combined with an upsampling method or synthetic minority oversampling technique (SMOTE) performed well, similar to AdaBoost with SMOTE. These results suggest that combining modeling strategies is necessary to improve our capacity to anticipate water contamination and avoid health risk.
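The point about accuracy being misleading on imbalanced data can be seen with a small worked example. The sketch below uses hypothetical data (5 exceedances in 100 samples) and a deliberately useless model that never predicts an exceedance; none of the numbers come from the paper.

```python
def confusion_counts(actual, predicted):
    """Return (tp, fn, fp, tn) for binary labels where 1 marks an exceedance."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    return tp, fn, fp, tn


# Hypothetical data: 5 exceedances among 100 samples.
actual = [1] * 5 + [0] * 95
always_safe = [0] * 100  # a model that never predicts an exceedance

tp, fn, fp, tn = confusion_counts(actual, always_safe)
accuracy = (tp + tn) / len(actual)
tpr = tp / (tp + fn)  # share of real exceedances that were flagged
fpr = fp / (fp + tn)  # share of safe days wrongly flagged
print(accuracy, tpr, fpr)
```

The model scores 95% accuracy while detecting none of the exceedances, which is why TPR and FPR are reported together rather than accuracy alone.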
Affiliation(s)
- Mathias Bourel
  - IMERL, Facultad de Ingeniería, Universidad de la República, Montevideo, Uruguay; Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay
- Angel M Segura
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay
- Carolina Crisci
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay
- Guzmán López
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay
- Lia Sampognaro
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay
- Victoria Vidal
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay
- Carla Kruk
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay; Departamento de Microbiología, Instituto de Investigaciones Biológicas Clemente Estable, Ministerio de Educación y Cultura, Montevideo, Uruguay; Instituto de Ecología y Ciencias Ambientales, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
- Claudia Piccini
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay; Departamento de Microbiología, Instituto de Investigaciones Biológicas Clemente Estable, Ministerio de Educación y Cultura, Montevideo, Uruguay
- Gonzalo Perera
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay
11
|
Wang L, Zhu Z, Sassoubre L, Yu G, Liao C, Hu Q, Wang Y. Improving the robustness of beach water quality modeling using an ensemble machine learning approach. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021; 765:142760. [PMID: 33131841 DOI: 10.1016/j.scitotenv.2020.142760] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 07/15/2020] [Revised: 09/28/2020] [Accepted: 09/28/2020] [Indexed: 05/12/2023]
Abstract
Microbial pollution of beach water can expose swimmers to harmful pathogens. Predictive modeling provides an alternative method for beach management that addresses several limitations associated with traditional culture-based methods of assessing water quality. Widely-used machine learning methods often suffer from high variability in performance from one year or beach to another. Therefore, the best machine learning method varies between beaches and years, making method selection difficult. This study proposes an ensemble machine learning approach referred to as model stacking that has a two-layered learning structure, where the outputs of five widely-used individual machine learning models (multiple linear regression, partial least square, sparse partial least square, random forest, and Bayesian network) are taken as input features for another model that produces the final prediction. Applying this approach to three beaches along eastern Lake Erie, New York, USA, we show that generally the model stacking approach was able to generate reliably good predictions compared to all of the five base models. The accuracy rankings of the stacking model consistently stayed 1st or 2nd every year, with yearly-average accuracy of 78%, 81%, and 82.3% at the three studied beaches, respectively. This study highlights the value of the model stacking approach in predicting beach water quality and solving other pressing environmental problems.
Affiliation(s)
- Leizhi Wang
  - Department of Civil, Structural and Environmental Engineering, University at Buffalo, The State University of New York, Buffalo 14220, NY, USA; Nanjing Hydraulic Research Institute, State Key laboratory of Hydrology, Water Resources and Hydraulic Engineering & Science, Nanjing 210029, China; Yangtze Institute for Conservation and Development, Nanjing, 210098, China
- Zhenduo Zhu
  - Department of Civil, Structural and Environmental Engineering, University at Buffalo, The State University of New York, Buffalo 14220, NY, USA
- Lauren Sassoubre
  - Department of Civil, Structural and Environmental Engineering, University at Buffalo, The State University of New York, Buffalo 14220, NY, USA
- Guan Yu
  - Department of Biostatistics, University at Buffalo, The State University of New York, Buffalo 14220, NY, USA
- Chen Liao
  - Program for Computational and Systems Biology, Memorial Sloan-Kettering Cancer Center, NY 10065, New York, USA
- Qingfang Hu
  - Nanjing Hydraulic Research Institute, State Key laboratory of Hydrology, Water Resources and Hydraulic Engineering & Science, Nanjing 210029, China; Yangtze Institute for Conservation and Development, Nanjing, 210098, China
- Yintang Wang
  - Nanjing Hydraulic Research Institute, State Key laboratory of Hydrology, Water Resources and Hydraulic Engineering & Science, Nanjing 210029, China; Yangtze Institute for Conservation and Development, Nanjing, 210098, China

12
|
Kelly E, Gidley M, Sinigalliano C, Kumar N, Solo-Gabriele HM. Impact of wastewater infrastructure improvements on beach water fecal indicator bacteria levels in Monroe County, Florida. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021; 763:143024. [PMID: 33168244 DOI: 10.1016/j.scitotenv.2020.143024] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/02/2020] [Revised: 09/26/2020] [Accepted: 10/09/2020] [Indexed: 06/11/2023]
Abstract
The effects of wastewater infrastructure construction on regional and local environments are unknown. This project evaluated the effects of such projects in Monroe County, Florida, an area that had undergone regional wastewater infrastructure improvements. We used fecal indicator bacteria (FIB) (fecal coliform and enterococci), as a proxy indicator of beach water quality for an 18-year period of record. At the highest level of aggregation, FIBs for all 17 beaches within the county were combined to evaluate trends on a yearly basis. At the lower level, yearly FIB trends were evaluated for each beach separately. FIB data on infrastructure project period (categorical variables: before, during, and after construction), and the influences of environmental conditions (quantitative variables of rainfall and temperature) were also evaluated. In the multiple regression models, enterococci and fecal coliform were significantly associated with rainfall (24 h, p < 0.0001) and water temperature (p < 0.0001) when only the quantitative variables were considered. When both categorical and quantitative variables were considered, project period was significant for enterococci (p < 0.0001) and fecal coliform (p < 0.0001), as was 24 h lagged rainfall. Overall, the most significant factors for both fecal coliform and enterococci were rainfall and project period. Considering all beaches, infrastructure projects seem to have the collective desired effects in the years following construction, as there were decreased FIBs measured at beach sites. Only through the aggregation of all projects and measurements at all beach sites could the decreases in FIB levels be observed. Local analysis is needed to explain anomalies from these general trends for specific beaches.
This understanding of FIBs, their responses to environmental and project factors, and the need for aggregated and local site analysis can provide guidance to managers at other locations with similar issues of failing wastewater infrastructure and frequent FIB exceedances.
Affiliation(s)
- E Kelly
  - University of Miami Leonard and Jayne Abess Center for Ecosystem Science and Policy, Coral Gables, FL, USA; University of Miami Department of Civil, Architectural and Environmental Engineering, Coral Gables, FL, USA; NSF NIEHS Oceans and Human Health Center, Rosenstiel School of Marine and Atmospheric Science, University of Miami, Miami, FL, USA
- M Gidley
  - NSF NIEHS Oceans and Human Health Center, Rosenstiel School of Marine and Atmospheric Science, University of Miami, Miami, FL, USA; National Oceanic and Atmospheric Administration (NOAA) Atlantic Oceanographic and Meteorological Laboratory (AOML) Environmental Microbiology, Miami, FL, USA; University of Miami Cooperative Institute for Marine and Atmospheric Studies (CIMAS), Miami, FL, USA
- C Sinigalliano
  - NSF NIEHS Oceans and Human Health Center, Rosenstiel School of Marine and Atmospheric Science, University of Miami, Miami, FL, USA; National Oceanic and Atmospheric Administration (NOAA) Atlantic Oceanographic and Meteorological Laboratory (AOML) Environmental Microbiology, Miami, FL, USA
- N Kumar
  - University of Miami Department of Public Health Sciences, Division of Environment & Public Health, Miami, FL, USA
- H M Solo-Gabriele
  - University of Miami Leonard and Jayne Abess Center for Ecosystem Science and Policy, Coral Gables, FL, USA; University of Miami Department of Civil, Architectural and Environmental Engineering, Coral Gables, FL, USA; NSF NIEHS Oceans and Human Health Center, Rosenstiel School of Marine and Atmospheric Science, University of Miami, Miami, FL, USA

13
|
Searcy RT, Boehm AB. A Day at the Beach: Enabling Coastal Water Quality Prediction with High-Frequency Sampling and Data-Driven Models. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2021; 55:1908-1918. [PMID: 33471505 DOI: 10.1021/acs.est.0c06742] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/12/2023]
Abstract
To reduce the incidence of recreational waterborne illness, fecal indicator bacteria (FIB) are measured to assess water quality and inform beach management. Recently, predictive FIB models have been used to aid managers in making beach posting and closure decisions. However, those predictive models must be trained using rich historical data sets consisting of FIB and environmental data that span years, and many beaches lack such data sets. Here, we investigate whether water quality data collected during discrete short duration, high-frequency beach sampling events (e.g., samples collected at sub-hourly intervals for 24-48 h) are sufficient to train predictive models that can be used for beach management. We use data collected during six high-frequency sampling events at three California marine beaches and train a total of 126 models using common data-driven techniques. Tide, solar irradiation, water temperature, significant wave height, and offshore wind speed were found to be the most important environmental variables in the models. We validate the predictive performance of models using withheld data. Random forests are consistently the top performing model type. Overall, we find that data-driven models trained using high-frequency FIB and environmental data perform well at predicting water quality and can be used to inform public health decisions at beaches.
Affiliation(s)
- Ryan T Searcy
  - Department of Civil & Environmental Engineering, Stanford University, 473 Via Ortega, Stanford, Palo Alto 94305, California, United States
- Alexandria B Boehm
  - Department of Civil & Environmental Engineering, Stanford University, 473 Via Ortega, Stanford, Palo Alto 94305, California, United States

14
|
Madani M, Seth R. Evaluating multiple predictive models for beach management at a freshwater beach in the Great Lakes region. JOURNAL OF ENVIRONMENTAL QUALITY 2020; 49:896-908. [PMID: 33016491 DOI: 10.1002/jeq2.20107] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/07/2020] [Revised: 05/10/2020] [Accepted: 05/14/2020] [Indexed: 06/11/2023]
Abstract
Recreational water quality is currently monitored at Sandpoint Beach on Lake St. Clair using culture-based enumeration of Escherichia coli. Using water quality and weather data collected over 4 yr, several multiple linear regression (MLR)-based models were developed for near real-time prediction of E. coli concentration and were tested using independent data from the fifth year. Model performance was assessed by the determination of metrics such as RMSE, accuracy, specificity, sensitivity, and area under the receiver operating characteristic curve (AUROC). Each of the developed MLR models described herein resulted in increased correct responses for both exceedance and non-exceedance of the applicable standard as compared to predictions based on E. coli measurements (persistence models, using the previous day's E. coli concentration), which is the method currently being used. The AUROC values for persistence models are between 0.5 and 0.6, as compared to >0.7 for all the MLR models described herein. Among the MLR models, model performance improved when qualitative sky weather condition, which is commonly reported but was not previously used in similar models, was included. To select the best model, a principal coordinate analysis was used to combine multiple model performance metrics and provide a more sensitive tool for model comparison. Although models developed using 2, 3, and 4 yr of monitoring data provided reasonable performance, the model developed using the most recent 2-yr data was marginally better. Thus, data from the most recent 2 yr are likely sufficient as a training dataset for updating the MLR model for Sandpoint Beach in the future.
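The persistence baseline mentioned in this abstract (using the previous day's E. coli concentration as today's prediction) can be sketched in a few lines of Python. This is an illustrative assumption rather than the study's implementation: the function names, the concentration series, and the threshold of 200 are all hypothetical and do not come from the paper or from any regulatory standard.

```python
def persistence_predictions(concentrations, threshold):
    """Pair each day's observed exceedance with the previous day's exceedance."""
    observed = [c > threshold for c in concentrations[1:]]
    predicted = [c > threshold for c in concentrations[:-1]]
    return observed, predicted


def sensitivity_specificity(observed, predicted):
    """Return (sensitivity, specificity) for boolean exceedance flags."""
    tp = sum(1 for o, p in zip(observed, predicted) if o and p)
    fn = sum(1 for o, p in zip(observed, predicted) if o and not p)
    tn = sum(1 for o, p in zip(observed, predicted) if not o and not p)
    fp = sum(1 for o, p in zip(observed, predicted) if not o and p)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical daily E. coli concentrations and a hypothetical threshold.
daily_ecoli = [50, 300, 80, 60, 250, 400, 90, 40]
THRESHOLD = 200

observed, predicted = persistence_predictions(daily_ecoli, THRESHOLD)
sens, spec = sensitivity_specificity(observed, predicted)
print(f"persistence model: sensitivity={sens:.2f} specificity={spec:.2f}")
```

Because the prediction is always one day late, a persistence model misses short-lived spikes and raises alarms the day after they end, which is consistent with the low AUROC values the abstract reports for this baseline.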
Affiliation(s)
- Mohammad Madani
  - Dep. of Civil and Environmental Engineering, Univ. of Windsor, Windsor, ON, N9B3P4, Canada
- Rajesh Seth
  - Dep. of Civil and Environmental Engineering, Univ. of Windsor, Windsor, ON, N9B3P4, Canada

15
|
Cazals M, Stott R, Fleury C, Proulx F, Prévost M, Servais P, Dorner S, Burnet JB. Near real-time notification of water quality impairments in recreational freshwaters using rapid online detection of β-D-glucuronidase activity as a surrogate for Escherichia coli monitoring. THE SCIENCE OF THE TOTAL ENVIRONMENT 2020; 720:137303. [PMID: 32145611 DOI: 10.1016/j.scitotenv.2020.137303] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 12/13/2019] [Revised: 02/12/2020] [Accepted: 02/12/2020] [Indexed: 06/10/2023]
Abstract
Waterborne disease outbreaks associated with recreational waters continue to be reported around the world despite existing microbiological water quality monitoring frameworks. Most regulations resort to the use of culture-based enumeration of faecal indicator bacteria such as Escherichia coli to protect bathers from gastrointestinal illness risks. However, the long sample-to-result time of standard culture-based assays (minimum 18-24 h) and infrequent regulatory sampling (weekly or less) do not enable detection of episodic water quality impairments and associated public health risks. The objective of this study was to assess the suitability of an autonomous online technology measuring β-D-glucuronidase (GLUC) activity for near real-time monitoring of microbiological water quality in recreational waters and for the resulting beach management decisions. GLUC activity and E. coli concentrations were monitored at three freshwater sites in Quebec, Canada (sites Qc1-3) and one site in New Zealand (site NZ) between 2016 and 2018. We found site-dependent linear relationships between GLUC activity and E. coli concentrations and using confusion matrices, we developed site-specific GLUC activity beach action values (BAVs) matching the regulatory E. coli BAVs. Using the regulatory E. coli BAV as the gold standard, rates of false alarms (unnecessary beach advisories using GLUC activity BAV) and failures to act (failure to trigger advisories using GLUC activity) ranged between 0 and 32% and between 3 and 10%, respectively, which is comparable to the rates reported in other studies using qPCR-defined BAVs. However, a major benefit of the autonomous enzymatic technology is the real-time reporting of threshold exceedances, while temporal trends in GLUC activity can assist in understanding the underlying dynamics of faecal pollution and potential health risks. 
Our study is the first to describe the applicability of online near real-time monitoring of microbiological water quality as a tool for improved beach management and public health protection.
Affiliation(s)
- Margot Cazals
  - Canada Research Chair in Source Water Protection, Department of Civil, Geological, and Mining Engineering, Polytechnique Montréal, Montréal, Québec H3T 1J4, Canada
- Rebecca Stott
  - National Institute of Water and Atmospheric Research (NIWA), Gate 10, Silverdale Road, Hillcrest, Hamilton 3251, New Zealand
- Carole Fleury
  - Service de l'eau, Direction de L'épuration des Eaux Usées, Montréal, Québec H1C 1V3, Canada
- François Proulx
  - Service du Traitement des Eaux, Quebec City, Quebec G1N 3X6, Canada
- Michèle Prévost
  - NSERC Industrial Chair on Drinking Water, Department of Civil, Geological, and Mining Engineering, Polytechnique Montréal, Montréal, Québec H3T 1J4, Canada
- Pierre Servais
  - Écologie des Systèmes Aquatiques, Université Libre de Bruxelles, Campus de la Plaine, CP 221, Boulevard du Triomphe, B-1050 Brussels, Belgium
- Sarah Dorner
  - Canada Research Chair in Source Water Protection, Department of Civil, Geological, and Mining Engineering, Polytechnique Montréal, Montréal, Québec H3T 1J4, Canada
- Jean-Baptiste Burnet
  - Canada Research Chair in Source Water Protection, Department of Civil, Geological, and Mining Engineering, Polytechnique Montréal, Montréal, Québec H3T 1J4, Canada; NSERC Industrial Chair on Drinking Water, Department of Civil, Geological, and Mining Engineering, Polytechnique Montréal, Montréal, Québec H3T 1J4, Canada

16
|
Aguilera R, Gershunov A, Benmarhnia T. Atmospheric rivers impact California's coastal water quality via extreme precipitation. THE SCIENCE OF THE TOTAL ENVIRONMENT 2019; 671:488-494. [PMID: 30933803 DOI: 10.1016/j.scitotenv.2019.03.318] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 01/11/2019] [Revised: 03/15/2019] [Accepted: 03/20/2019] [Indexed: 06/09/2023]
Abstract
Precipitation in California is projected to become more volatile: less frequent but more extreme as global warming pushes midlatitude frontal cyclones further poleward while bolstering the atmospheric rivers (ARs), which tend to produce the region's extreme rainfall. Pollutant accumulation and delivery to coastal waters can be expected to increase, as lengthening dry spells will be increasingly punctuated by more extreme precipitation events. Coastal pollution exposes human populations to high levels of fecal bacteria and associated pathogens, which can cause a variety of health impacts. Consequently, studying the impact of atmospheric rivers as the mechanism generating pulses of water pollution in coastal areas is relevant for public health and in the context of climate change. We aimed to quantify the links between precipitation events and water quality in order to explore meteorological causes as first steps toward effective early warning systems for the benefit of population health in California and beyond. We used historical gridded daily precipitation and weekly multiple fecal bacteria indicators at ~500 monitoring locations in California's coastal waters to identify weekly associations between precipitation and water quality during 2003-09, using canonical correlation analysis to account for the nested/clustered nature of longitudinal data. We then quantified, using a recently published catalog of atmospheric rivers, the proportion of coastal pollution events attributable to ARs. Association between precipitation and fecal bacteria was strongest in Southern California. Over two-thirds of coastal water pollution spikes exceeding one standard deviation were associated with ARs. This work highlights the importance of skillful AR landfall predictions in reducing vulnerability to extreme weather and improving resilience of human populations in a varying and changing climate.
Quantifying the impacts of ARs on waterborne diseases is important for planning effective preventive strategies for public health.
Affiliation(s)
- Rosana Aguilera
  - Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA
- Alexander Gershunov
  - Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA
- Tarik Benmarhnia
  - Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA; Department of Family Medicine and Public Health, University of California San Diego, La Jolla, CA, USA

17
|
Quilliam RS, Taylor J, Oliver DM. The disparity between regulatory measurements of E. coli in public bathing waters and the public expectation of bathing water quality. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2019; 232:868-874. [PMID: 30530277 DOI: 10.1016/j.jenvman.2018.11.138] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/28/2018] [Revised: 11/07/2018] [Accepted: 11/28/2018] [Indexed: 06/09/2023]
Abstract
The main objectives of the European Union (EU) Bathing Water Directive (BWD) 2006/7/EC are to safeguard public health and protect designated aquatic environments from microbial pollution. The BWD is implemented through legislation by individual EU Member States and uses faecal indicator organisms (FIOs) as microbial pollution compliance parameters to determine season-end bathing water classifications (either 'Excellent', 'Good', 'Sufficient' or 'Poor'). These classifications are based on epidemiological studies that have linked human exposure to FIOs with the risk of contracting a gastrointestinal illness (GI). However, understanding public attitudes towards bathing water quality, together with perceptions of relative exposure risks, is often overlooked and yet critically important for informing environmental management decisions at the beach and ensuring effective risk communication. Therefore, this study aimed to determine the effectiveness of current regulatory strategies for informing beach users about bathing water quality, and to assess public understanding of the BWD classifications in terms of exposure risk and public health. Two UK designated bathing waters were selected as case studies, and questionnaires were deployed to beach-users. The bathing waters had different classification histories and both had electronic signage in operation for communicating daily water quality predictions. The majority of respondents did not recognise the standardised EU bathing water quality classification signs, and were unaware of information boards or the electronic signs predicting the water quality on that particular day. In general, respondents perceived the bathing water at their respective beach to be either 'good' or 'sufficient', which were also the lowest classifications of water quality they would be willing to accept for bathing. 
However, the lowest level of risk of contracting a gastrointestinal illness that respondents would be willing to accept suggested a significant misunderstanding of the BWD classification system, with the majority (91%) of respondents finding only a <1% risk level acceptable. The 'Good' classification is much less stringent in terms of likelihood of GI. This study has shown that the current public understanding of the BWD classifications in terms of exposure risk and public health is limited, and an investment in methods for disseminating information to the public is needed in order to allow beach-users to make more informed decisions about using bathing waters.
Affiliation(s)
- Richard S Quilliam
  - Biological and Environmental Sciences, Faculty of Natural Sciences, University of Stirling, FK9 4LA, UK
- Jessica Taylor
  - Biological and Environmental Sciences, Faculty of Natural Sciences, University of Stirling, FK9 4LA, UK
- David M Oliver
  - Biological and Environmental Sciences, Faculty of Natural Sciences, University of Stirling, FK9 4LA, UK

18
|
Searcy RT, Taggart M, Gold M, Boehm AB. Implementation of an automated beach water quality nowcast system at ten California oceanic beaches. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2018; 223:633-643. [PMID: 29975890 DOI: 10.1016/j.jenvman.2018.06.058] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/13/2018] [Revised: 06/12/2018] [Accepted: 06/17/2018] [Indexed: 06/08/2023]
Abstract
Fecal indicator bacteria (FIB) like Escherichia coli and enterococci are monitored at beaches around the world to reduce the incidence of recreational waterborne illness. Measurements are usually made weekly, but FIB concentrations can exhibit extreme variability, fluctuating over shorter periods. The result is that water quality has likely changed by the time data are provided to beachgoers. Here, we present an automated water quality prediction system (called the nowcast system) that is capable of providing daily predictions of water quality for numerous beaches. We created nowcast models for 10 California beaches using weather, oceanographic, and other environmental variables as input to tuned regression models to predict if FIB concentrations were above single sample water quality standards. Rainfall was used as a variable in nearly every model. The models were calibrated and validated using historical data. Subsequently, models were implemented during the 2017 swim season in collaboration with local beach managers. During the 2017 swim season, the median sensitivity of the nowcast models was 0.5 compared to 0 for the current method of using day-to-week old measurements to make beach posting decisions. Model specificity was also high (median of 0.87). During the implementation phase, nowcast models provided an average of 140 additional days per beach of updated water quality information to managers when water quality measurements were not made. The work presented herein emphasizes that a one-size-fits-all approach to nowcast modeling, even when beaches are in close proximity, is infeasible. Flexibility in modeling approaches and adaptive responses to modeling and data challenges are required when implementing nowcast models for beach management.
Affiliation(s)
- Ryan T Searcy
  - Heal the Bay, 1444 9th Street, Santa Monica, CA 90401, USA
- Mitzy Taggart
  - Heal the Bay, 1444 9th Street, Santa Monica, CA 90401, USA
- Mark Gold
  - UCLA, 2248 Murphy Hall, 410 Charles E. Young Drive East, Los Angeles, CA 90095, USA
- Alexandria B Boehm
  - Department of Civil and Environmental Engineering, Stanford University, 473 Via Ortega, Stanford, CA, 94305, USA

19
|
Zhang J, Qiu H, Li X, Niu J, Nevers MB, Hu X, Phanikumar MS. Real-Time Nowcasting of Microbiological Water Quality at Recreational Beaches: A Wavelet and Artificial Neural Network-Based Hybrid Modeling Approach. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2018; 52:8446-8455. [PMID: 29957996 DOI: 10.1021/acs.est.8b01022] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 06/08/2023]
Abstract
The number of beach closings caused by bacterial contamination has continued to rise in recent years, putting beachgoers at risk of exposure to contaminated water. Current approaches predict levels of indicator bacteria using regression models containing a number of explanatory variables. Data-based modeling approaches can supplement routine monitoring data and provide highly accurate short-term forecasts of beach water quality. In this paper, we apply the nonlinear autoregressive network with exogenous inputs (NARX) method with explanatory variables to predict Escherichia coli concentrations at four Lake Michigan beach sites. We also apply the nonlinear input-output network (NIO) and nonlinear autoregressive neural network (NAR) methods in addition to a hybrid wavelet-NAR (WA-NAR) model and demonstrate their application. All models were tested using 3 months of observed data. Results revealed that the NARX models provided the best performance and that the WA-NAR model, which requires no explanatory variables, outperformed the NIO and NAR models; therefore, the WA-NAR model is suitable for application to data-scarce regions. The models proposed in this paper were evaluated using multiple performance metrics, including sensitivity and specificity measures, and produced results comparable or superior to those of previous mechanistic and statistical models developed for the same beach sites. The relatively high R² values between data and the NARX models (R² values of ~0.8 for the beach sites and ~0.9 for the river site) indicate that the new class of models shows promise for beach management.
Affiliation(s)
- Juan Zhang
  - Institute of Groundwater and Earth Sciences, Jinan University, Guangzhou 510632, China
- Han Qiu
  - Department of Civil and Environmental Engineering, Michigan State University, East Lansing, Michigan 48824, United States
- Xiaoyu Li
  - Department of Mathematics and Statistics, Auburn University, Auburn, Alabama 36849, United States
- Jie Niu
  - Institute of Groundwater and Earth Sciences, Jinan University, Guangzhou 510632, China
- Meredith B Nevers
  - USGS Great Lakes Science Center, Lake Michigan Ecological Research Station, Chesterton, Indiana 46304, United States
- Xiaonong Hu
  - Institute of Groundwater and Earth Sciences, Jinan University, Guangzhou 510632, China
- Mantha S Phanikumar
  - Department of Civil and Environmental Engineering, Michigan State University, East Lansing, Michigan 48824, United States

20
|
Jennings WC, Chern EC, O'Donohue D, Kellogg MG, Boehm AB. Frequent detection of a human fecal indicator in the urban ocean: environmental drivers and covariation with enterococci. ENVIRONMENTAL SCIENCE. PROCESSES & IMPACTS 2018; 20:480-492. [PMID: 29404550 PMCID: PMC6686843 DOI: 10.1039/c7em00594f] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 05/03/2023]
Abstract
Fecal pollution of surface waters presents a global human health threat. New molecular indicators of fecal pollution have been developed to address shortcomings of traditional culturable fecal indicators. However, there is still little information on their fate and transport in the environment. The present study uses spatially and temporally extensive data on traditional (culturable enterococci, cENT) and molecular (qPCR-enterococci, qENT and human-associated marker, HF183/BacR287) indicator concentrations in marine water surrounding highly-urbanized San Francisco, California, USA to investigate environmental and anthropogenic processes that impact fecal pollution. We constructed multivariable regression models for fecal indicator bacteria at 14 sampling stations. The human marker was detected more frequently in our study than in many other published studies, with detection frequency at some stations as high as 97%. The odds of cENT, qENT, and HF183/BacR287 exceeding health-relevant thresholds were statistically elevated immediately following discharges of partially treated combined sewage, and cENT levels dissipated after approximately 1 day. However, combined sewer discharges were not important predictors of indicator levels typically measured in weekly monitoring samples. Instead, precipitation and solar insolation were important predictors of cENT in weekly samples, while precipitation and water temperature were important predictors of HF183/BacR287 and qENT. The importance of precipitation highlights the significance of untreated storm water as a source of fecal pollution to the urban ocean, even for a city served by a combined sewage system. Sunlight and water temperature likely control persistence of the indicators via photoinactivation and dark decay processes, respectively.
Affiliation(s)
- Wiley C Jennings
- Department of Civil and Environmental Engineering, Environmental Engineering and Science, Stanford University, 94305-4020, USA.
- Eunice C Chern
- San Francisco Public Utilities Commission, Water Quality Laboratory, 1000 El Camino Real, Millbrae, CA 94030, USA and EPA Region 10 Laboratory, 7411 Beach Dr E, Port Orchard, WA 98366, USA
- Diane O'Donohue
- San Francisco Public Utilities Commission, Oceanside Biology Laboratory, 3500 Great Highway, San Francisco, CA 94132, USA
- Michael G Kellogg
- San Francisco Public Utilities Commission, Oceanside Biology Laboratory, 3500 Great Highway, San Francisco, CA 94132, USA
- Alexandria B Boehm
- Department of Civil and Environmental Engineering, Environmental Engineering and Science, Stanford University, 94305-4020, USA.
21
McClary JS, Boehm AB. Transcriptional Response of Staphylococcus aureus to Sunlight in Oxic and Anoxic Conditions. Front Microbiol 2018; 9:249. [PMID: 29599752 PMCID: PMC5863498 DOI: 10.3389/fmicb.2018.00249] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 11/28/2017] [Accepted: 01/31/2018] [Indexed: 12/20/2022]
Abstract
The transcriptional response of Staphylococcus aureus strain Newman to sunlight exposure was investigated under both oxic and anoxic conditions using RNA sequencing to gain insight into potential mechanisms of inactivation. S. aureus is a pathogenic bacterium detected at recreational beaches which can cause gastrointestinal illness and skin infections, and is of increasing public health concern. To investigate the S. aureus photostress response in oligotrophic seawater, S. aureus cultures were suspended in seawater and exposed to full spectrum simulated sunlight. Experiments were performed under oxic or anoxic conditions to gain insight into the effects of oxygen-mediated and non-oxygen-mediated inactivation mechanisms. Transcript abundance was measured after 6 h of sunlight exposure using RNA sequencing and was compared to transcript abundance in paired dark control experiments. Culturable S. aureus decayed following biphasic inactivation kinetics with initial decay rate constants of 0.1 and 0.03 m² kJ⁻¹ in oxic and anoxic conditions, respectively. RNA sequencing revealed that 71 genes had different transcript abundance in the oxic sunlit experiments compared to dark controls, and 18 genes had different transcript abundance in the anoxic sunlit experiments compared to dark controls. The majority of genes showed reduced transcript abundance in the sunlit experiments under both conditions. Three genes (ebpS, NWMN_0867, and NWMN_1608) were found to have the same transcriptional response to sunlight between both oxic and anoxic conditions. In the oxic condition, transcripts associated with porphyrin metabolism, nitrate metabolism, and membrane transport functions were increased in abundance during sunlight exposure. Results suggest that S. aureus responds differently to oxygen-dependent and oxygen-independent photostress, and that endogenous photosensitizers play an important role during oxygen-dependent indirect photoinactivation.
Affiliation(s)
- Jill S McClary
- Civil and Environmental Engineering, Stanford University, Stanford, CA, United States
- Alexandria B Boehm
- Civil and Environmental Engineering, Stanford University, Stanford, CA, United States
22
Partyka ML, Bond RF, Chase JA, Atwill ER. Spatial and temporal variability of bacterial indicators and pathogens in six California reservoirs during extreme drought. WATER RESEARCH 2018; 129:436-446. [PMID: 29179123 DOI: 10.1016/j.watres.2017.11.038] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 06/14/2017] [Revised: 10/10/2017] [Accepted: 11/15/2017] [Indexed: 06/07/2023]
Abstract
California has one of the largest systems of surface water reservoirs in the world, providing irrigation water to California's agriculturally productive Central Valley. Irrigation water is recognized as a vehicle for the microbial contamination of raw produce and must be monitored according to new federal regulation. The purpose of this study was to further understanding of the variability of fecal indicator bacteria (Escherichia coli and fecal coliforms) and pathogens (E. coli O157:H7 (O157), non-O157 Shiga toxin-producing E. coli (STEC) and Salmonella) along both horizontal and vertical profiles within California reservoirs. Monthly sampling was conducted in six reservoirs located in the foothills of the Western Sierra Nevada during the summer irrigation season and extreme drought conditions of 2014 (n = 257). Concentrations of fecal indicator bacteria were highly variable between reservoirs (p < 0.05) and along the horizontal profile (p < 0.001) from upstream to downstream, with higher concentrations typically found outside of the reservoirs than within. Though many of the reservoirs were thermally stratified, bacterial concentrations were not associated with water temperature (p > 0.05) or any one particular depth strata (p < 0.05). However, prevalence of Salmonella and STEC (16/70 and 9/70 respectively) was higher in the deep strata than in mid or surface layers. We found no statistical association between samples collected downstream of reservoirs and those from the reservoirs themselves. Continued monitoring and modeling of both bacterial indicators and enteric pathogens are critical to our ability to estimate the risk of surface irrigation water supplies and make appropriate management decisions.
Affiliation(s)
- Melissa L Partyka
- Western Center for Food Safety, School of Veterinary Medicine, University of California, Davis, USA.
- Ronald F Bond
- Western Center for Food Safety, School of Veterinary Medicine, University of California, Davis, USA
- Jennifer A Chase
- Western Center for Food Safety, School of Veterinary Medicine, University of California, Davis, USA
- Edward R Atwill
- Western Center for Food Safety, School of Veterinary Medicine, University of California, Davis, USA
23
He C, Post Y, Dony J, Edge T, Patel M, Rochfort Q. A physical descriptive model for predicting bacteria level variation at a dynamic beach. JOURNAL OF WATER AND HEALTH 2016; 14:617-629. [PMID: 27441857 DOI: 10.2166/wh.2016.206] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/06/2023]
Abstract
A rational-based physical descriptive model (PDM) has been developed to predict the levels of Escherichia coli in water at a beach with dynamic conditions in the Greater Toronto Area (GTA), Ontario, Canada. Bacteria loadings in the water were affected not only by multiple physical factors (precipitation, discharge, wind, etc.), but also by cumulative effects, intensity, duration and timing of storm events. These may not be linearly related to the observed variations in bacteria levels, and are unlikely to be properly represented by a widely used multiple linear regression model. In order to account for these complex relationships, the amounts of precipitation and nearby creek discharge, the impact of various time-related factors, lag time between events and sample collection, and threshold for different parameters were used in determining bacteria levels. This new comprehensive PDM approach improved the accuracy of the E. coli level predictions in the studied beach water compared to the previously developed statistical predictive and presently used geometric mean models. In spite of the complexity and dynamic conditions at the studied beach, the PDM achieved 75% accuracy overall for the five case years examined.
Affiliation(s)
- Cheng He
- National Water Research Institute, Environment Canada, 867 Lakeshore Road, Burlington, Ontario, Canada L7R 4A6
- Yvonne Post
- Department of Environmental Engineering, University of Guelph, 50 Stone Road East, Guelph, Ontario, Canada N1G 2W1
- John Dony
- Department of Environmental Engineering, University of Guelph, 50 Stone Road East, Guelph, Ontario, Canada N1G 2W1
- Tom Edge
- National Water Research Institute, Environment Canada, 867 Lakeshore Road, Burlington, Ontario, Canada L7R 4A6
- Mahesh Patel
- Toronto Public Health, 399 The West Mall, Toronto, Ontario, Canada M9C 2Y2
- Quintin Rochfort
- National Water Research Institute, Environment Canada, 867 Lakeshore Road, Burlington, Ontario, Canada L7R 4A6
24
Thoe W, Choi KW, Lee JHW. Predicting 'very poor' beach water quality gradings using classification tree. JOURNAL OF WATER AND HEALTH 2016; 14:97-108. [PMID: 26837834 DOI: 10.2166/wh.2015.094] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2023]
Abstract
A beach water quality prediction system has been developed in Hong Kong using multiple linear regression (MLR) models. However, linear models are found to be weak at capturing the infrequent 'very poor' water quality occasions when Escherichia coli (E. coli) concentration exceeds 610 counts/100 mL. This study uses a classification tree to increase the accuracy in predicting the 'very poor' water quality events at three Hong Kong beaches affected either by non-point source or point source pollution. Binary-output classification trees (to predict whether E. coli concentration exceeds 610 counts/100 mL) are developed over the periods before and after the implementation of the Harbour Area Treatment Scheme, when systematic changes in water quality were observed. Results show that classification trees can capture more 'very poor' events in both periods when compared to the corresponding linear models, with an increase in correct positives by an average of 20%. Classification trees are also developed at two beaches to predict the four-category Beach Water Quality Indices. They perform worse than the binary tree and give excessive false alarms of 'very poor' events. Finally, a combined modelling approach using both MLR model and classification tree is proposed to enhance the beach water quality prediction system for Hong Kong.
Affiliation(s)
- Wai Thoe
- Department of Civil and Environmental Engineering, Environmental and Water Studies, Stanford University, Stanford, CA 94305, USA
- King Wah Choi
- Department of Civil and Environmental Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
- Joseph Hun-wei Lee
- Department of Civil and Environmental Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong