Water Research

Volume 170, 1 March 2020, 115369
Can routine monitoring of E. coli fully account for peak event concentrations at drinking water intakes in agricultural and urban rivers?

https://doi.org/10.1016/j.watres.2019.115369

Highlights

  • Variability and uncertainty in Escherichia coli concentrations were modeled.

  • Critical periods of microbial contamination of raw water were identified.

  • Parametric models could predict peak Escherichia coli concentrations.

  • Long-term routine monitoring accounted for peak events at drinking water intakes.

Abstract

In several jurisdictions, the arithmetic mean of Escherichia coli concentrations in raw water serves as the metric to set minimal treatment requirements by drinking water treatment plants (DWTPs). An accurate and precise estimation of this mean is therefore critical to define adequate requirements. Distributions of E. coli concentrations in surface water can be heavily skewed and require statistical methods capable of characterizing uncertainty. We present four simple parametric models with different upper tail behaviors (gamma, log-normal, Lomax, mixture of two log-normal distributions) to explicitly account for the influence of peak events on the mean concentration. The performance of these models was tested using large E. coli data sets (200–1800 samples) from raw water regulatory monitoring at six DWTPs located in urban and agricultural catchments. Critical seasons of contamination and hydrometeorological factors leading to peak events were identified. Event-based samples were collected at an urban DWTP intake during two hydrometeorological events using online β-d-glucuronidase activity monitoring as a trigger. Results from event-based sampling were used to verify whether selected parametric distributions predicted targeted peak events. We found that the upper tail of the log-normal and the Lomax distributions better predicted large concentrations than the upper tail of the gamma distribution. Weekly sampling for two years in urban catchments and for four years in agricultural catchments generated reasonable estimates of the average raw water E. coli concentrations. The proposed methodology can be easily used to inform the development of sampling strategies and statistical indices to set site-specific treatment requirements.

Introduction

The World Health Organization’s (WHO) water quality guidelines recommend a preventive and risk-based approach for drinking water quality management. For this purpose, a spectrum of microbial risk assessment approaches is available, from simple sanitary inspections and risk matrices to more complex ones such as quantitative microbial risk assessment (QMRA) (WHO, 2016b). The QMRA approach can provide relative estimates of microbial risks at drinking water treatment plants (DWTPs), which may be particularly useful to prioritize investments in improving water treatment or in implementing source water protection measures. However, in many situations, data on pathogen occurrence and concentrations in raw water are not available at DWTPs, and only faecal indicator bacteria (FIB) are measured to characterize source water quality. Therefore, to support the implementation of a source-to-tap approach, simplified classification methods, known as “bin classification”, were developed to determine minimum treatment requirements according to a specific level of FIB concentrations in raw water. Different summary statistics (e.g., mean concentration, maximum concentration) and sampling strategies (e.g., weekly sampling, monthly sampling, event-based sampling) are specified in regulatory requirements worldwide for bin classification (Table A1).

The arithmetic mean is a valid metric for summarizing microbial concentrations when the goal is to characterize the risk of multiple exposures to low doses of pathogens (Haas, 1996). The annual mean is usually considered in QMRA because annual health-based targets are recommended in guidelines and regulations (Sinclair et al., 2015). Precise estimation of the mean is challenging for surface water because microbial concentrations can vary over several orders of magnitude within hours or days (Burnet et al., 2019a). This metric relies on the law of large numbers: as the sample size grows, the sample mean gets closer to the true mean. However, the meaning of “large” depends on the distribution of the data. Convergence is much faster for normal or other thin-tailed distributions than for heavy-tailed distributions. If the variance is very large, a single new observation can be large enough to overwhelm all previous observations, regardless of how many observations have accumulated.
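The dependence of convergence on tail behavior can be illustrated with a small simulation. The sketch below is not taken from the study: the two distributions, their parameters (a normal with mean 100 and SD 20, a log-normal with log-scale SD of 2), the sample sizes, and the helper name `median_relative_error` are all assumptions chosen for illustration. It compares how far the sample mean typically lies from the true mean for a thin-tailed and a heavy-tailed distribution.

```python
import math
import random
import statistics


def median_relative_error(sampler, true_mean, n, trials, rng):
    """Median of |sample mean - true mean| / true mean over repeated samples of size n."""
    errors = []
    for _ in range(trials):
        sample_mean = sum(sampler(rng) for _ in range(n)) / n
        errors.append(abs(sample_mean - true_mean) / true_mean)
    return statistics.median(errors)


# Illustrative parameters only (assumptions, not values from the study).
MU, SIGMA = 0.0, 2.0  # log-normal parameters on the natural-log scale
cases = {
    # Thin-tailed example: normal with mean 100 and SD 20.
    "normal": (lambda r: r.gauss(100.0, 20.0), 100.0),
    # Heavy-tailed example: log-normal, whose true mean is exp(mu + sigma^2 / 2).
    "log-normal": (
        lambda r: r.lognormvariate(MU, SIGMA),
        math.exp(MU + SIGMA ** 2 / 2.0),
    ),
}

rng = random.Random(42)  # fixed seed so the run is repeatable
errors = {}
for name, (sampler, true_mean) in cases.items():
    errors[name] = {
        n: median_relative_error(sampler, true_mean, n, 200, rng)
        for n in (10, 100, 1000)
    }
    for n, err in errors[name].items():
        print(f"{name:>10}  n={n:<5d} median relative error = {err:.3f}")
```

Under these assumed parameters, the sample mean of the heavy-tailed case remains much further from its true mean than the thin-tailed case at the same sample size, which is the behavior the paragraph describes.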

Numerous studies have shown that heavy rainfall can rapidly increase microbial contamination loads in water (Atherholt et al., 1998; Burnet et al., 2014, 2019a; Kistemann et al., 2002; Signor et al., 2005). In urban areas, combined sewer overflow (CSO) discharges induced by heavy rainfall or snowmelt events can cause recurring microbial peaks in raw water at DWTPs (Jalliffier-Verne et al., 2016; Madoux-Humery et al., 2016). In agricultural areas, similar heavy rainfall episodes can increase microbial contamination of surface waters as a result of overland transport, tile drainage systems and resuspension from stream sediments (e.g., Dorner et al., 2006). A statistical approach was proposed to incorporate such peak events in the risk assessment for a hypothetical frequency of occurrence (Petterson et al., 2006). However, methods for the estimation of peak event frequency have not been proposed yet. Stochastic models are used in other fields to evaluate the frequency of extreme precipitation events (Katz et al., 2002), streamflow peaks (Katz et al., 2002), and extreme pollution from runoff (Harremoës, 1988). These models have implications for quantifying the frequency of extreme events at DWTPs but have not been utilized in the context of microbial safety of drinking water.

In Quebec, Canada, raw water E. coli concentrations have been measured at least weekly since 2013 at large DWTPs (>10,000 inhabitants). These extensive datasets provide a unique opportunity to study temporal variations in different catchments. The first objective of the study was to develop a methodology to correctly estimate the mean E. coli concentration in surface drinking water sources by considering peak events. Large routine monitoring datasets from six DWTPs were fitted with parametric distributions having different upper tail behaviors. For the best-fit distributions, we then evaluated the minimum sample size required to estimate the mean concentration for different ranges of uncertainty. Second, key contributors to the mean concentration level were identified by examining the influence of seasonality and hydrometeorological factors on temporal variations. Finally, we conducted event-based sampling during two hydrometeorological events at an urban DWTP to evaluate whether the selected parametric distributions predicted these targeted peak events. Implications for the development of sampling strategies and probabilistic models are discussed for setting health-based drinking water treatment requirements.
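The general idea of fitting a parametric distribution and then asking how many samples are needed to pin down the mean can be sketched as follows. This is a simplified stand-in, not the authors' procedure: it uses a maximum-likelihood log-normal fit and a Monte Carlo search, and every name (`fit_lognormal`, `fraction_within_factor`, `minimum_sample_size`), the synthetic data, the factor-of-two tolerance, the 90% coverage target, and the candidate sample sizes are assumptions made for illustration. It also assumes all concentrations are strictly positive.

```python
import math
import random
import statistics


def fit_lognormal(concentrations):
    """Maximum-likelihood log-normal fit: mean and SD of the natural logs."""
    logs = [math.log(c) for c in concentrations]  # assumes every value is > 0
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs)  # the MLE uses the 1/n (population) form
    return mu, sigma


def lognormal_mean(mu, sigma):
    """Arithmetic mean implied by log-normal parameters."""
    return math.exp(mu + sigma ** 2 / 2.0)


def fraction_within_factor(mu, sigma, n, factor, trials, rng):
    """Share of simulated sample means (size n) within [true/factor, true*factor]."""
    true_mean = lognormal_mean(mu, sigma)
    hits = 0
    for _ in range(trials):
        m = sum(rng.lognormvariate(mu, sigma) for _ in range(n)) / n
        if true_mean / factor <= m <= true_mean * factor:
            hits += 1
    return hits / trials


def minimum_sample_size(mu, sigma, factor=2.0, coverage=0.9,
                        candidates=(26, 52, 104, 208, 416), trials=500, seed=1):
    """Smallest candidate n meeting the coverage target, or None if none does."""
    rng = random.Random(seed)
    for n in candidates:
        if fraction_within_factor(mu, sigma, n, factor, trials, rng) >= coverage:
            return n
    return None


# Synthetic stand-in for ten years of weekly data (hypothetical parameters).
data_rng = random.Random(7)
synthetic = [data_rng.lognormvariate(3.0, 1.5) for _ in range(520)]

mu_hat, sigma_hat = fit_lognormal(synthetic)
candidates = (26, 52, 104, 208, 416)  # roughly 0.5 to 8 years of weekly samples
n_min = minimum_sample_size(mu_hat, sigma_hat, candidates=candidates)
print(f"mu = {mu_hat:.2f}, sigma = {sigma_hat:.2f}, "
      f"fitted mean = {lognormal_mean(mu_hat, sigma_hat):.1f}")
print(f"smallest candidate sample size meeting the target: {n_min}")
```

A full analysis would also compare competing distributions (for example gamma or Lomax) and propagate parameter uncertainty, which this sketch deliberately omits.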

Section snippets

Study sites

Six DWTPs fed by rivers located in urban and agricultural catchments were selected and classified by the mean annual river flow rate in ascending order from A to D (Table 1). DWTP C2 is located downstream of DWTP C1. Wastewater treatment plants (WWTPs), CSO discharge points, and the dominant land cover type were identified for areas 15 km upstream of the drinking water intakes. In Quebec, CSOs are equipped with recording devices to measure the frequency and the daily cumulative duration of

Descriptive statistics

The sample mean of E. coli concentrations in raw water varied between DWTPs, from 22 to 507 E. coli/100 mL (Table 3). Overall, the mean and the mean absolute deviation (MAD) decreased with the mean flow rate of the river. A 0.2 log10 increase in the mean and MAD was observed between DWTP C1 and C2. DWTPs B, C2, and E displayed the highest SD to MAD ratio. The kurtosis was greater than 155 at DWTPs C2 and E, but was only 25 at DWTP B, indicating potential bimodality of the empirical distribution.
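The summary statistics named above can be computed as in the sketch below. It is an illustration on synthetic data, not the study's code: the function name `describe` and the test distributions are assumptions, MAD is taken as the mean absolute deviation about the mean, and kurtosis is taken as the non-excess fourth standardized moment (3 for a normal distribution), since the snippet does not say which convention the authors used. For a normal distribution the SD to MAD ratio is sqrt(pi/2), about 1.25, so larger ratios point to heavier tails.

```python
import math
import random
import statistics


def describe(values):
    """Mean, mean absolute deviation, SD, SD/MAD ratio, and kurtosis of a sample."""
    n = len(values)
    mean = statistics.fmean(values)
    mad = sum(abs(v - mean) for v in values) / n        # mean absolute deviation
    sd = statistics.pstdev(values)                      # population (1/n) SD
    m4 = sum((v - mean) ** 4 for v in values) / n       # fourth central moment
    return {
        "mean": mean,
        "mad": mad,
        "sd": sd,
        "sd_to_mad": sd / mad,
        "kurtosis": m4 / sd ** 4,                       # non-excess; normal = 3
    }


rng = random.Random(0)
# Synthetic examples only (hypothetical parameters, not monitoring data).
normal_stats = describe([rng.gauss(100.0, 20.0) for _ in range(20000)])
heavy_stats = describe([rng.lognormvariate(3.0, 2.0) for _ in range(20000)])

print(f"normal reference ratio sqrt(pi/2) = {math.sqrt(math.pi / 2):.3f}")
for label, stats in (("normal", normal_stats), ("log-normal", heavy_stats)):
    print(f"{label:>10}: SD/MAD = {stats['sd_to_mad']:.2f}, "
          f"kurtosis = {stats['kurtosis']:.1f}")
```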

Optimizing distribution selection to describe E. coli variations in source water

Candidate parametric distributions were selected to fit raw water E. coli measurements. The following underlying generative processes of distributions were considered for the selection. The combination of small-scale processes at a higher aggregate scale tends to yield a common probability distribution consistent with given constraints that maximize the entropy (Frank, 2009). A maximization of entropy with a constraint on the arithmetic mean results in an exponential distribution. If the

Conclusions

We have shown that it is possible to use simple parametric models and graphical tools to consider different tail behaviors for the evaluation of the mean E. coli concentration in raw water. The application of this approach to large data sets collected with routine and event-based monitoring strategies at six drinking water treatment plants located in different types of catchments demonstrated that:

  • Weekly sampling for two years in urban catchments and for four years in agricultural catchments

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (55)

  • American Public Health Association

    Standard Methods for the Examination of Water and Wastewater

    (2005)
  • T. Atherholt et al.

    Effect of rainfall on Giardia and Crypto

    J. Am. Water Works Assoc.

    (1998)
  • K.P. Burnham et al.

    Multimodel inference: understanding AIC and BIC in model selection

    Sociol. Methods Res.

    (2004)
  • Canadian Council of Ministers of the Environment

    Canada-wide Strategy for the Management of Municipal Wastewater Effluent

    (2007)
  • F.C. Curriero et al.

    The association between extreme precipitation and waterborne disease outbreaks in the United States, 1948–1994

    Am. J. Public Health

    (2001)
  • S.M. Dorner et al.

    Hydrologic modeling of pathogen fate and transport

    Environ. Sci. Technol.

    (2006)
  • S.A. Frank

    The common patterns of nature

    J. Evol. Biol.

    (2009)
  • S. Frank

    How to read probability distributions as statements about process

    Entropy

    (2014)
  • A. Gelman et al.

    Inference from simulations and monitoring convergence

    Handbook of Markov Chain Monte Carlo

    (2011)
  • A. Gelman et al.

    Bayesian Data Analysis

    (2013)
  • Gouvernement du Québec

    Règlement sur les ouvrages municipaux d’assainissement des eaux usées

    (2015)
  • Gouvernement du Québec

    Règlement sur le prélèvement des eaux et leur protection

    (2018)
  • C.N. Haas

    Importance of distributional form in characterizing inputs to Monte Carlo risk assessments

    Risk Anal.

    (1997)
  • S.E. Hrudey et al.

    Walkerton: lessons learned in comparison with waterborne outbreaks in the developed world

    J. Environ. Eng. Sci.

    (2002)
  • T. Kistemann et al.

    Microbial load of drinking water reservoir tributaries during extreme rainfall and runoff

    Appl. Environ. Microbiol.

    (2002)
  • I. Koponen

    Analytic approach to the problem of convergence of truncated Lévy flights towards the Gaussian stochastic process

    Phys. Rev. E

    (1995)
  • J. Koschelnik et al.

    Rapid analysis of β-D-glucuronidase activity in water using fully automated technology

    Water Resour. Manag.

    (2015)