Development of gradient boosting-assisted machine learning data-driven model for free chlorine residual prediction

  • Research Article
Frontiers of Environmental Science & Engineering

Abstract

Chlorine-based disinfection is ubiquitous in conventional drinking water treatment (DWT) and mitigates the threat of acute microbial disease caused by pathogens that may be present in source water. An important index of disinfection efficiency is the free chlorine residual (FCR), a regulated disinfection parameter in the US that indirectly measures disinfectant power for preventing microbial recontamination during DWT and distribution. This work demonstrates how machine learning (ML) can be implemented to improve FCR forecasting when supplied with water quality data from a real, full-scale chlorine disinfection system in Georgia, USA. More precisely, a gradient-boosting ML model (CatBoost) was developed from a full year of DWT plant-generated chlorine disinfection data, including water quality parameters (e.g., temperature, turbidity, pH) and operational process data (e.g., flowrates), to predict FCR. Four gradient-boosting models were implemented, with the best-performing model achieving a coefficient of determination, R², of 0.937. Shapley additive explanation values were used to interpret the model’s results, revealing that standard DWT operating parameters, although non-intuitive and theoretically non-causal, vastly improved prediction performance. These results provide a base case for data-driven supervision of DWT disinfection and suggest process monitoring methods that give plant operators better information for safe chlorine dosing to maintain optimum FCR.
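To make the modeling idea in the abstract concrete, the sketch below fits a least-squares gradient-boosting regressor and scores it with the coefficient of determination. It is an illustration under loud assumptions, not the authors' pipeline: it uses plain-Python regression stumps instead of CatBoost, and the data are synthetic, generated from a made-up relationship between temperature, turbidity, pH, and FCR. All function names (`fit_stump`, `fit_gbm`, `predict_gbm`, `r2_score`, `synthetic_sample`) are hypothetical.

```python
import random


def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) for the best split."""
    n, d = len(X), len(X[0])
    total = sum(residuals)
    best_gain, best = -1.0, None
    for j in range(d):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical feature values
            right_sum = total - left_sum
            # Maximizing this term is equivalent to minimizing squared error.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k)
            if gain > best_gain:
                best_gain = gain
                best = (j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best


def predict_stump(stump, x):
    feature, threshold, left_value, right_value = stump
    return left_value if x[feature] <= threshold else right_value


def fit_gbm(X, y, n_rounds=200, learning_rate=0.1):
    """Least-squares boosting: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * predict_stump(stump, x)
                for p, x in zip(pred, X)]
    return base, learning_rate, stumps


def predict_gbm(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


random.seed(0)


def synthetic_sample():
    # ASSUMPTION: invented relationship, not plant data from the study.
    temp = random.uniform(5, 30)        # water temperature, deg C
    turb = random.uniform(0.1, 1.0)     # turbidity, NTU
    ph = random.uniform(6.5, 8.5)
    fcr = 2.0 - 0.03 * temp - 0.2 * turb + 0.05 * (ph - 7) + random.gauss(0, 0.03)
    return [temp, turb, ph], fcr


data = [synthetic_sample() for _ in range(400)]
X_train = [x for x, _ in data[:300]]
y_train = [y for _, y in data[:300]]
X_test = [x for x, _ in data[300:]]
y_test = [y for _, y in data[300:]]

model = fit_gbm(X_train, y_train)
r2_test = r2_score(y_test, [predict_gbm(model, x) for x in X_test])
print(f"held-out R^2 on synthetic data: {r2_test:.3f}")
```

The score is computed on samples held out from training, which is the usual way to report a figure such as R²; the value printed here says nothing about the plant data in the study.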

Data Accessibility Statement: The data and code that support the findings of this study are available from the corresponding author, Prof. Yongsheng Chen, upon reasonable request.

Acknowledgements

This research was partially supported by: US Department of Agriculture’s National Institute of Food and Agriculture, Agriculture and Food Research Initiative, Water for Food Production Systems (No. 2018-68011-28371); National Science Foundation (USA) (Nos. 1936928, 2112533); US Department of Agriculture’s National Institute of Food and Agriculture (No. 2020-67021-31526); and US Environmental Protection Agency (No. 840080010).

Author information

Corresponding author

Correspondence to Yongsheng Chen.

Ethics declarations

Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Additional information

Highlights

• A machine learning approach was applied to predict free chlorine residuals.

• Annual data were obtained from the chlorination unit at a 98 MGD water treatment plant.

• The last model iteration achieved high prediction performance (R² = 0.937).

• Non-intuitive parameters were found to be highly significant to predictions.
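The last highlight rests on Shapley-value attributions. As a rough illustration of what such an attribution measures, the sketch below computes exact Shapley values by enumerating feature subsets, replacing "absent" features with a baseline value. This is an assumption-laden toy: baseline Shapley is only one of several variants and may differ from the explainer the authors used, and the model `toy_fcr`, its coefficients, and the input values are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact baseline Shapley values of f at x (cost grows as 2^n features)."""
    n = len(x)

    def value(subset):
        # Features outside the subset are set to their baseline value.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_fcr(z):
    # ASSUMPTION: invented linear model, not a result from the study.
    temp, turb, flow = z
    return 1.5 - 0.02 * temp - 0.3 * turb + 0.01 * flow


x = [25.0, 0.8, 60.0]          # instance to explain (hypothetical)
baseline = [15.0, 0.4, 50.0]   # reference operating point (hypothetical)
phi = shapley_values(toy_fcr, x, baseline)
print(phi)
```

Two properties are worth checking on any such attribution: the values sum to the difference between the prediction at `x` and at the baseline, and for a linear model each value reduces to the coefficient times the feature's offset from the baseline.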

About this article

Cite this article

Helm, W., Zhong, S., Reid, E. et al. Development of gradient boosting-assisted machine learning data-driven model for free chlorine residual prediction. Front. Environ. Sci. Eng. 18, 17 (2024). https://doi.org/10.1007/s11783-024-1777-6

  • DOI: https://doi.org/10.1007/s11783-024-1777-6
