Can routine monitoring of E. coli fully account for peak event concentrations at drinking water intakes in agricultural and urban rivers?
Introduction
The World Health Organization’s (WHO) water quality guidelines recommend a preventive, risk-based approach to drinking water quality management. For this purpose, a spectrum of microbial risk assessment approaches is available, from simple sanitary inspections and risk matrices to more complex methods such as quantitative microbial risk assessment (QMRA) (WHO, 2016b). The QMRA approach can provide relative estimates of microbial risks at drinking water treatment plants (DWTPs), which may be particularly useful for prioritizing investments in improved water treatment or in source water protection measures. However, in many situations, data on pathogen occurrence and concentrations in raw water are not available at DWTPs, and only faecal indicator bacteria (FIB) are measured to characterize source water quality. Therefore, to support the implementation of a source-to-tap approach, simplified classification methods, known as “bin classification”, were developed to determine minimum treatment requirements according to the level of FIB concentrations in raw water. Different summary statistics (e.g., mean concentration, maximum concentration) and sampling strategies (e.g., weekly sampling, monthly sampling, event-based sampling) are specified in regulatory requirements worldwide for bin classification (Table A1).
The arithmetic mean is a valid metric for summarizing microbial concentrations when characterizing the risk of multiple exposures to low doses of pathogens (Haas, 1996). The annual mean is usually considered in QMRA because annual health-based targets are recommended in guidelines and regulations (Sinclair et al., 2015). Precise estimation of the mean is challenging for surface water because microbial concentrations can vary over several orders of magnitude within hours or days (Burnet et al., 2019a). Estimation of the mean relies on the law of large numbers: as the sample size grows, the sample mean gets closer to the true mean. However, the meaning of “large” depends on the distribution of the data. Convergence is much faster for normal or other thin-tailed distributions than for heavy-tailed distributions. If the variance is very large, any new observation can be large enough to overwhelm all previous observations, regardless of the number of accumulated observations.
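The dependence of convergence on tail weight can be illustrated with a small simulation. The sketch below is not the method used in the study: it assumes lognormally distributed concentrations, and the function name `median_relative_error`, the log-scale standard deviations (0.5 and 2.5), and the sample sizes are hypothetical choices made only for illustration.

```python
import math
import random
import statistics


def median_relative_error(sigma, n, n_rep=200, mu=0.0, seed=1):
    """Median relative error of the sample mean of n lognormal draws.

    sigma is the standard deviation on the natural-log scale; a larger
    sigma gives a heavier upper tail. All settings are illustrative.
    """
    rng = random.Random(seed)
    true_mean = math.exp(mu + sigma ** 2 / 2)  # exact mean of a lognormal
    errors = []
    for _ in range(n_rep):
        sample = [rng.lognormvariate(mu, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        errors.append(abs(sample_mean - true_mean) / true_mean)
    return statistics.median(errors)


if __name__ == "__main__":
    # 52 and 260 samples mimic one and five years of weekly sampling.
    for sigma in (0.5, 2.5):
        for n in (52, 260):
            err = median_relative_error(sigma, n)
            print(f"sigma={sigma}, n={n}: median relative error = {err:.3f}")
```

Under these assumptions, the relative error of the sample mean shrinks quickly for the thin-tailed case and remains large for the heavy-tailed case at the same sample size, which is the behaviour described above.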
Numerous studies have shown that heavy rainfall can rapidly increase microbial contamination loads in water (Atherholt et al., 1998; Burnet et al., 2014, 2019a; Kistemann et al., 2002; Signor et al., 2005). In urban areas, combined sewer overflow (CSO) discharges induced by heavy rainfall or snowmelt events can cause recurring microbial peaks in raw water at DWTPs (Jalliffier-Verne et al., 2016; Madoux-Humery et al., 2016). In agricultural areas, similar heavy rainfall episodes can increase microbial contamination of surface waters as a result of overland transport, tile drainage systems and resuspension from stream sediments (e.g., Dorner et al., 2006). A statistical approach was proposed to incorporate such peak events in the risk assessment for a hypothetical frequency of occurrence (Petterson et al., 2006). However, methods for estimating peak event frequency have not yet been proposed. Stochastic models are used in other fields to evaluate the frequency of extreme precipitation events (Katz et al., 2002), streamflow peaks (Katz et al., 2002), and extreme pollution from runoff (Harremoës, 1988). These models could be used to quantify the frequency of extreme events at DWTPs but have not been applied in the context of the microbial safety of drinking water.
In Quebec, Canada, raw water E. coli concentrations have been measured at least weekly since 2013 at large DWTPs (>10,000 inhabitants). These extensive datasets provide a unique opportunity to study temporal variations in different catchments. The first objective of this study was to develop a methodology to correctly estimate the mean E. coli concentration in surface drinking water sources by considering peak events. Large routine monitoring datasets from six DWTPs were fitted with parametric distributions having different upper tail behaviors. For the best-fit distributions, we then evaluated the minimum sample size required to estimate the mean concentration for different ranges of uncertainty. Secondly, key contributors to the mean concentration level were identified by examining the influence of seasonality and hydrometeorological factors on temporal variations. Finally, we conducted event-based sampling during two hydrometeorological events at an urban DWTP to evaluate whether the selected parametric distributions predicted these targeted peak events. Implications for the development of sampling strategies and probabilistic models are discussed for setting health-based drinking water treatment requirements.
Study sites
Six DWTPs fed by rivers located in urban and agricultural catchments were selected and classified by the mean annual river flow rate in ascending order from A to D (Table 1). DWTP C2 is located downstream of DWTP C1. Wastewater treatment plants (WWTPs), CSO discharge points, and the dominant land cover type were identified for areas 15 km upstream of the drinking water intakes. In Quebec, CSOs are equipped with recording devices to measure the frequency and the daily cumulative duration of
Descriptive statistics
The sample mean of E. coli concentrations in raw water varied between DWTPs, from 22 to 507 E. coli/100 mL (Table 3). Overall, the mean and the mean absolute deviation (MAD) decreased with the mean flow rate of the river. A 0.2 log10 increase in the mean and MAD was observed between DWTP C1 and C2. DWTPs B, C2, and E displayed the highest SD to MAD ratio. The kurtosis was greater than 155 at DWTPs C2 and E, but was only 25 at DWTP B, indicating potential bimodality of the empirical distribution.
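As a rough illustration of how these summary statistics respond to a single peak, the sketch below computes the mean, the mean absolute deviation (MAD), the standard deviation (SD), the SD to MAD ratio, and the kurtosis. It rests on assumptions not stated in the text: population (divide-by-n) moments and non-excess kurtosis are used, the helper `tail_summary` is hypothetical, and the concentrations are invented values, not data from the study. For a normal distribution the SD to MAD ratio is about 1.25, so larger ratios point to heavier tails.

```python
import math


def tail_summary(values):
    """Summary statistics that are sensitive to upper-tail behaviour."""
    n = len(values)
    mean = sum(values) / n
    deviations = [x - mean for x in values]
    mad = sum(abs(d) for d in deviations) / n   # mean absolute deviation
    m2 = sum(d ** 2 for d in deviations) / n    # second central moment
    m4 = sum(d ** 4 for d in deviations) / n    # fourth central moment
    sd = math.sqrt(m2)                          # population SD (assumption)
    return {
        "mean": mean,
        "mad": mad,
        "sd": sd,
        "sd_to_mad": sd / mad,
        "kurtosis": m4 / m2 ** 2,               # non-excess kurtosis (assumption)
    }


# Hypothetical concentrations in E. coli/100 mL, not data from the study.
baseline = [20, 35, 15, 40, 25, 30, 10, 45, 20, 30]
with_peak = baseline + [5000]  # the same series plus one peak event

if __name__ == "__main__":
    for label, data in (("baseline", baseline), ("with peak", with_peak)):
        stats = tail_summary(data)
        print(label, {name: round(value, 2) for name, value in stats.items()})
```

In this toy series, one peak observation raises the mean, the SD to MAD ratio, and the kurtosis together, which is why these statistics are read jointly as indicators of tail weight.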
Optimizing distribution selection to describe E. coli variations in source water
Candidate parametric distributions were selected to fit raw water E. coli measurements. The following underlying generative processes of distributions were considered for the selection. The combination of small-scale processes at a higher aggregate scale tends to yield a common probability distribution consistent with given constraints that maximize the entropy (Frank, 2009). A maximization of entropy with a constraint on the arithmetic mean results in an exponential distribution. If the
Conclusions
We have shown that it is possible to use simple parametric models and graphical tools to consider different tail behaviors for the evaluation of the mean E. coli concentration in raw water. The application of this approach to large data sets collected with routine and event-based monitoring strategies at six drinking water treatment plants located in different types of catchments demonstrated that:
- Weekly sampling for two years in urban catchments and for four years in agricultural catchments
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (55)
- et al. Spatial and temporal distribution of Cryptosporidium and Giardia in a drinking water resource: implications for monitoring and risk assessment. Sci. Total Environ. (2014)
- et al. Tracking the contribution of multiple raw and treated wastewater discharges at an urban drinking water supply using near real-time monitoring of β-D-glucuronidase activity. Water Res. (2019)
- et al. Autonomous online measurement of β-D-glucuronidase activity in surface water: is it suitable for rapid E. coli monitoring? Water Res. (2019)
- Measurement of inequality. Handb. Income Distrib. (2000)
- How to average microbial densities to characterize risk. Water Res. (1996)
- Stochastic models for estimation of extreme pollution from urban runoff. Water Res. (1988)
- et al. Cumulative effects of fecal contamination from combined sewer overflows: management for source water protection. J. Environ. Manag. (2016)
- et al. Statistics of extremes in hydrology. Adv. Water Resour. (2002)
- et al. The effects of combined sewer overflow events on riverine sources of drinking water. Water Res. (2016)
- et al. Normalized truncated Levy walks applied to the study of financial indices. Phys. A Stat. Mech. Appl. (2007)
- Standard Methods for the Examination of Water and Wastewater
- Effect of rainfall on Giardia and Crypto. J. Am. Water Works Assoc.
- Multimodel inference: understanding AIC and BIC in model selection. Sociol. Methods Res.
- Canada-wide Strategy for the Management of Municipal Wastewater Effluent
- The association between extreme precipitation and waterborne disease outbreaks in the United States, 1948–1994. Am. J. Public Health
- Hydrologic modeling of pathogen fate and transport. Environ. Sci. Technol.
- The common patterns of nature. J. Evol. Biol.
- How to read probability distributions as statements about process. Entropy
- Inference from simulations and monitoring convergence. Handbook of Markov Chain Monte Carlo
- Bayesian Data Analysis
- Règlement sur les ouvrages municipaux d’assainissement des eaux usées
- Règlement sur le prélèvement des eaux et leur protection
- Importance of distributional form in characterizing inputs to Monte Carlo risk assessments. Risk Anal.
- Walkerton: lessons learned in comparison with waterborne outbreaks in the developed world. J. Environ. Eng. Sci.
- Microbial load of drinking water reservoir tributaries during extreme rainfall and runoff. Appl. Environ. Microbiol.
- Analytic approach to the problem of convergence of truncated Lévy flights towards the Gaussian stochastic process. Phys. Rev. E
- Rapid analysis of β-D-glucuronidase activity in water using fully automated technology. Water Resour. Manag.