Elsevier

Water Research

Volume 184, 1 October 2020, 116103

An integrated approach based on virtual data augmentation and deep neural networks modeling for VFA production prediction in anaerobic fermentation process

https://doi.org/10.1016/j.watres.2020.116103

Highlights

  • An integrated approach, named RSDS-DNNs, was proposed for VFA production prediction.

  • RSDS method was established to augment virtual data for DNNs models.

  • DNNs well trained on virtual data achieved high accuracy.

  • The RSDS-DNNs method can be extended to simulate bioprocesses with small datasets.

Abstract

Data-driven models are suitable for simulating biological wastewater treatment processes with complex intrinsic mechanisms. However, the raw data collected in the early stages of biological experiments are normally insufficient to train data-driven models. In this study, an integrated modeling approach incorporating the random standard deviation sampling (RSDS) method and deep neural networks (DNNs) models was established to predict volatile fatty acid (VFA) production in the anaerobic fermentation process. The RSDS method, based on the mean values (x̄) and standard deviations (α) calculated from repeated experimental determinations, was first developed for virtual data augmentation. The DNNs models were then established to learn features from the virtual data and predict VFA production. The results showed that when 20000 virtual samples, each including five input variables of the anaerobic fermentation process, were used to train a DNNs model with 16 hidden layers and 100 hidden neurons in each layer, the best correlation coefficient of 0.998 and the minimal mean absolute percentage error of 3.28% were achieved. This integrated approach can learn nonlinear information from virtual data generated by the RSDS method and consequently enlarges the application range of DNNs models in simulating biological wastewater treatment processes with small datasets.
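As an illustration only, the two accuracy measures reported above could be computed as in the following minimal sketch. It assumes that the correlation coefficient is the Pearson coefficient between measured and predicted values, which the abstract does not state explicitly. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def mape(measured, predicted):
    """Mean absolute percentage error, in percent (measured values must be non-zero)."""
    errors = [abs((m - p) / m) for m, p in zip(measured, predicted)]
    return 100.0 * sum(errors) / len(errors)


def pearson_r(measured, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)


# Hypothetical VFA concentrations, for illustration only (not data from the paper).
measured_vfa = [1000.0, 2000.0, 4000.0, 5000.0]
predicted_vfa = [1050.0, 1900.0, 4100.0, 4900.0]

print(f"MAPE = {mape(measured_vfa, predicted_vfa):.3f}%")
print(f"r    = {pearson_r(measured_vfa, predicted_vfa):.4f}")
```

A lower MAPE and a correlation coefficient closer to 1 both indicate closer agreement between predictions and measurements. MAPE is undefined when a measured value is zero, so such points would need separate handling.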

Introduction

The boundary between wastes and resources is becoming increasingly blurred owing to the rapid technical development of resource recovery from diverse wastes. In the wastewater treatment field, the demand for a green, recyclable economy has prompted the transition from traditional wastewater treatment plants (WWTPs) to water resource recovery facilities (Regmi et al., 2019). Anaerobic fermentation, one of the most widely studied resource recovery processes, has multiple functions, such as yielding biogas for energy supply (Asadi et al., 2019) and producing volatile fatty acid (VFA) for bioplastic synthesis (Coats et al., 2016; Luo et al., 2020b) or biological nitrogen removal (Yuan et al., 2019). Recent discoveries about VFA production and synchronous in situ phosphorus recovery as vivianite in anaerobic fermentation processes have opened up new avenues for resource recovery (Cao et al., 2019; Wu et al., 2020b).

To optimize anaerobic fermentation for a high yield of valuable bio-products, various mathematical models have been developed to simulate anaerobic fermentation processes and predict the production of hydrogen or VFA (Asadi et al., 2019; Hinken et al., 2014; Luo et al., 2020a; Mu et al., 2007; Nair et al., 2016; Yogeswari et al., 2019). However, the complex biochemical processes, which involve multiple simultaneous reactions, and the non-linear relationships among anaerobic process variables make it difficult to establish an accurate mechanistic model (e.g., Anaerobic Digestion Model No. 1) for anaerobic fermentation processes (Batstone et al., 2002; Xie et al., 2016).

In addition to mechanistic models, artificial neural networks (ANNs) models (e.g., feedforward backpropagation neural networks and adaptive network-based fuzzy inference systems) have also been applied to forecast bioproduction in anaerobic fermentation systems (Asadi et al., 2019; Oloko-Oba et al., 2018; Wang et al., 2019; Yogeswari et al., 2019). ANNs models can predict output variables based on the relationships between input and output variables without considering internal mechanisms explicitly (LeCun et al., 2015; Qi and Majda, 2019; Reichstein et al., 2019). Various parameters involved in anaerobic fermentation, including pH, moisture content, total volatile solids, VFA, total chemical oxygen demand (COD), soluble COD, the ratio of carbon to nitrogen, temperature, and retention time, can be chosen as input variables of ANNs models (Nair et al., 2016; Najafi and Faizollahzadeh Ardabili, 2018; Zhao et al., 2019). The relationships between output variables (i.e., production of methane or hydrogen) and the selected input variables are then identified from the prepared training datasets by the ANNs models (Antwi et al., 2017; Yogeswari et al., 2019). However, owing to the weak generalization ability resulting from their shallow structure, ANNs models may not be able to learn all of the knowledge within the complex anaerobic fermentation process (Goodfellow et al., 2016). Compared with shallow ANNs models, deep neural networks (DNNs) models have deeper structures with more hidden layers and hidden neurons and consequently can learn more concrete features from datasets (Goodfellow et al., 2016; LeCun et al., 2015). It is well known that training DNNs models requires large datasets. For instance, to predict the sorption performance of a wide range of carbonaceous materials, massive data from the literature (from 2005 to 2019) were collected to train DNNs models (Sigmund et al., 2020). To predict the total Kjeldahl nitrogen level in a wastewater treatment plant, data of 609 days from the Benchmark Simulation Model No. 2 were utilized to train a semi-supervised deep neural regression network (Yan et al., 2020). However, because experiments are costly and time-consuming, it is difficult to obtain large datasets in biological wastewater treatment processes, especially for novel bioprocesses and initial experimental attempts. Therefore, the shortage of data limits the application of data-driven models in biological wastewater treatment processes with small datasets.

To this end, researchers have attempted to develop several artificial data augmentation approaches for solving small dataset problems in data-driven models (Abdul Lateh et al., 2017). For example, a structure-based data transformation method was proposed to separate datasets into clusters and dynamically generate the number of clusters (Li et al., 2012). Moreover, a particle swarm optimization-based virtual sample generation approach was developed to generate feasible virtual samples over the search space to improve the accuracy of forecasting models (Chen et al., 2017). However, these data generation procedures are complex and cannot make full use of the data obtained from repeated determination experiments. Therefore, new virtual data augmentation methods should be developed to overcome the data limitations and broaden the application range of data-driven models in biological wastewater treatment processes.

In this study, an integrated modeling approach composed of a new virtual data augmentation method, named the random standard deviation sampling (RSDS) method, and DNNs models was established to simulate VFA production in an anaerobic fermentation system. The data trends of VFA production and relevant parameters were presented first. Then, the RSDS method, based on the mean value and standard deviation calculated from experimental measurements performed in triplicate, was developed to provide sufficient data for data-driven models. After that, the accuracy of the established RSDS-DNNs models trained with virtual data was evaluated by comparison with the results of ANNs models. The effects of virtual dataset size on training RSDS-DNNs models were also analyzed. Finally, practical implications of the new RSDS-DNNs approach for simulating wastewater treatment processes with data-driven models were highlighted.
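The idea of generating virtual data from the mean and standard deviation of triplicate measurements can be sketched as below. This is a rough illustration under a stated assumption, not the authors' exact procedure: each virtual value is assumed to be drawn uniformly from the interval [mean − SD, mean + SD], whereas the sampling rule actually used by the RSDS method is not given in this excerpt. The function name, variable names, and measurement values are all hypothetical.

```python
import random
import statistics


def rsds_augment(replicates, n_virtual, rng):
    """Generate virtual samples from repeated measurements of each variable.

    replicates: dict mapping a variable name to its list of repeated measurements.
    ASSUMPTION: each virtual value is drawn uniformly from [mean - sd, mean + sd];
    the sampling rule of the published RSDS method may differ.
    """
    # Mean and sample standard deviation of each measured variable.
    stats = {
        name: (statistics.mean(values), statistics.stdev(values))
        for name, values in replicates.items()
    }
    virtual_samples = []
    for _ in range(n_virtual):
        sample = {
            name: mean + rng.uniform(-1.0, 1.0) * sd
            for name, (mean, sd) in stats.items()
        }
        virtual_samples.append(sample)
    return virtual_samples


# Hypothetical triplicate measurements at one sampling point (not data from the paper).
triplicates = {
    "pH": [4.0, 4.1, 3.9],
    "Fe2_mg_per_L": [580.0, 590.0, 582.0],
    "VFA_mg_per_L": [5200.0, 5350.0, 5100.0],
}

virtual = rsds_augment(triplicates, n_virtual=1000, rng=random.Random(0))
print(len(virtual), virtual[0])
```

Passing a seeded `random.Random` instance keeps the generated virtual dataset reproducible. Under this assumption, every virtual value stays within one standard deviation of the measured mean, so the augmented data cannot drift outside the experimentally observed spread.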

Experimental setup and data origin

Waste activated sludge (WAS) and food waste (FW) were put into glass bottles with a 600-mL working volume for anaerobic co-fermentation. FeCl₃ was selected as the ferric iron source and dosed into a 300-mL WAS mixture to give an Fe/P (total P in the WAS) molar ratio of 1.5. Our previous studies demonstrated that the addition of FeCl₃ and FW could synergistically increase the activity of hydrolytic (protease and α-glucosidase) and acid-forming enzymes (phosphotransacetylase and acetate kinase), which

VFA production coupled with Fe³⁺ reduction and PO₄³⁻ release

The variations of PO₄³⁻, Fe²⁺ and VFA concentrations during the anaerobic fermentation experiments at pH = 3–5 are presented in Fig. 2 (data for 10% and 20% FW are not shown). pH had significant effects on PO₄³⁻ release and Fe³⁺ reduction, and the dosage of extra co-fermentation substrate slightly increased the release of soluble Fe²⁺ and PO₄³⁻. With the dosage of 30% FW, the maximal soluble Fe²⁺ concentration reached 584.10 mg/L on the 12th day at pH = 4, and the maximal soluble PO₄³⁻

Conclusions

The integrated modeling approach composed of the RSDS method and DNNs models was established to predict VFA production in the anaerobic fermentation process. With the help of virtual data generated by the RSDS method, DNNs models were successfully developed to simulate VFA production. Augmenting virtual data within a certain range could improve the performance of DNNs models, although a trial-and-error approach was required to find the best balance between model structure and virtual data size. The integrated

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (51878244 and 51878243), the Fundamental Research Funds for the Central Universities (B200202101), and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), China.

References (48)

  • A. Hosseinzadeh et al. Modeling water flux in osmotic membrane bioreactor by adaptive network-based fuzzy inference system and artificial neural network. Bioresour. Technol. (2020)
  • D.-C. Li et al. Using structure-based data transformation method to improve prediction accuracies for small data sets. Decis. Support Syst. (2012)
  • H. Liu et al. Acidogenic fermentation of proteinaceous sewage sludge: effect of pH. Water Res. (2012)
  • J. Luo et al. Potential influences of exogenous pollutants occurred in waste activated sludge on anaerobic digestion: a review. J. Hazard Mater. (2020)
  • J. Luo et al. Promotion of short-chain fatty acids production and fermented sludge properties via persulfate treatments with different activators: performance and mechanisms. Bioresour. Technol. (2020)
  • A. MacAllister et al. Using high-fidelity meta-models to improve performance of small dataset trained Bayesian Networks. Expert Syst. Appl. (2020)
  • Y. Mu et al. A kinetic approach to anaerobic hydrogen-producing process. Water Res. (2007)
  • V.V. Nair et al. Artificial neural network based modeling to evaluate methane yield from biogas in a laboratory-scale anaerobic bioreactor. Bioresour. Technol. (2016)
  • B. Najafi et al. Application of ANFIS, ANN, and logistic methods in estimating biogas production from spent mushroom compost (SMC). Resour. Conserv. Recycl. (2018)
  • M.I. Oloko-Oba et al. Performance evaluation of three different-shaped bio-digesters for biogas production and optimization by artificial neural network integrated with genetic algorithm. Sustainable Energy Technologies and Assessments (2018)
  • J. Schmidhuber. Deep learning in neural networks: an overview. Neural Network. (2015)
  • X. Wang et al. The link of feast-phase dissolved oxygen (DO) with substrate competition and microbial selection in PHA production. Water Res. (2017)
  • Y. Wu et al. Continuous waste activated sludge and food waste co-fermentation for synchronously recovering vivianite and volatile fatty acids at different sludge retention times: performance and microbial response. Bioresour. Technol. (2020)
  • Y. Wu et al. A novel approach of synchronously recovering phosphorus as vivianite and volatile fatty acids during waste activated sludge and food waste co-fermentation: performance and mechanisms. Bioresour. Technol. (2020)