Introduction

Growing human populations and climate change are driving unprecedented changes in aquatic environments. Water quality deteriorates due to climate change, damaging aquatic life1. Aquaculture, food production, environmental monitoring, and industrial production are just a few of the sectors that depend on dissolved oxygen (DO) concentration, which is an essential water quality indicator2. The amount of DO in water is typically stated in milligrams per litre (mg/L). The ideal DO for high-quality water is 5–6 mg/L3. Water quality deteriorates, and mass fish mortality occurs, if the DO level falls below 2 mg/L4. Sewage and water treatment facilities use biological treatment via water jet oxygenation systems to manage and regulate human and animal waste effectively5. Numerous approaches have been developed to treat wastewater, including physical procedures such as adsorption and membrane filtration, chemical procedures such as Fenton oxidation and electrochemical oxidation, and various biological methods6. Alternative options for raising water quality include hydraulic structures such as stepped spillways, nozzle orifices, or free overflow structures. Weirs, venturi aerators, stepped cascades, and stepped spillways can all increase the amount of dissolved oxygen in a river flow system. Drop structures, including baffle blocks, chutes, weirs, and cascades, are frequently employed in straight-flow canals. Recent research has examined the air/water flow ratio (Qa/Qw) and the aeration efficiency (E20) in different hydraulic structures7,8,9. Several experimental studies have addressed air entrainment by plunging water jets10,11,12. The increase in dissolved oxygen at weirs has also been widely studied. Gameson13 was the first to document the aeration potential of river weirs. Since then, several laboratory studies on weir aeration have been conducted14,15,16,17,18,19. Research has also been conducted on Parshall and venturi flumes, which improve DO concentration. The Parshall flume is a versatile instrument because of its wide range of uses in wastewater treatment plants, mine discharge, irrigation canals, and dam seepage20. Several researchers have conducted experiments on E20 at Parshall flumes21,22.

Plunging jets from rectangular weirs, sluices, and other comparable water systems occur in many natural settings, where they capture oxygen from the air and so help to purify falling or running water.

The instantaneous rate of change in DO concentration (\(dC/dt\)) is given by Eq. (1):

$$\frac{dC}{dt}={K}_{L}\frac{{A}_{s}}{V}({C}_{sat}-C),$$
(1)

where \({C}_{sat}\) and \(C\) are the saturation and actual DO concentrations, respectively, \({K}_{L}\) is the liquid film coefficient for oxygen, \({A}_{s}\) is the surface area, and \(V\) is the volume.

\({C}_{sat}\) is predicted from the water–atmosphere partitioning of oxygen. Provided that \({C}_{sat}\) stays constant over time, the oxygen transfer efficiency, E, can be calculated as follows:

$$E=\frac{{C}_{down}-{C}_{up}}{{C}_{sat}-{C}_{up}}.$$
(2)

The subscripts ‘up’ and ‘down’ indicate up-stream and down-stream locations of the jet screen, respectively.
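As a brief sketch of how Eq. (2) follows from Eq. (1), the rate equation can be integrated over a contact time \(t\), assuming that \({K}_{L}\), \({A}_{s}/V\), and \({C}_{sat}\) stay constant during that time (the integration limits and this constancy are assumptions made here for illustration):

$$\int_{{C}_{up}}^{{C}_{down}}\frac{dC}{{C}_{sat}-C}={K}_{L}\frac{{A}_{s}}{V}\int_{0}^{t}dt\quad \Rightarrow \quad \frac{{C}_{sat}-{C}_{down}}{{C}_{sat}-{C}_{up}}=\mathrm{exp}\left(-{K}_{L}\frac{{A}_{s}}{V}t\right),$$

so that

$$E=1-\frac{{C}_{sat}-{C}_{down}}{{C}_{sat}-{C}_{up}}=1-\mathrm{exp}\left(-{K}_{L}\frac{{A}_{s}}{V}t\right).$$

Under these assumptions, E approaches one as the contact time or the interfacial area per unit volume grows, and equals zero when no transfer takes place.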

Aeration efficiency is the ratio of the oxygen actually transferred to the water to the oxygen that could theoretically be transferred under ideal conditions. The aeration efficiency (E20) is one (100%) when all the oxygen that could possibly be transferred to the water is actually transferred, and zero when no dissolved oxygen is transferred. Because the solubility of oxygen in water varies with temperature, a correction factor is applied to keep the measurements uniform and to standardise the results acquired at various temperatures to 20 °C, as presented in Eqs. (3) and (4)23. The experiments were performed within a water temperature (T) range of 23–25 °C.

$$1-{E}_{20}={(1-E)}^{1/f}$$
(3)

The aeration exponent, \(f\), used to obtain the oxygen transfer efficiency at 20 °C, E20, is calculated as follows:

$$f=1+2.1\times {10}^{-2}\times \left(T-20\right)+8.25\times {10}^{-5}\times ({T-20)}^{2}.$$
(4)
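A minimal sketch of how Eqs. (2), (3) and (4) chain together is given below. The function names and the concentration readings are hypothetical and are not taken from Table S1; only the formulas come from the text.

```python
def transfer_efficiency(c_up, c_down, c_sat):
    """Eq. (2): fraction of the upstream oxygen deficit removed by the jets."""
    return (c_down - c_up) / (c_sat - c_up)


def aeration_exponent(temp_c):
    """Eq. (4): exponent f for a water temperature in degrees Celsius."""
    dt = temp_c - 20.0
    return 1.0 + 2.1e-2 * dt + 8.25e-5 * dt ** 2


def efficiency_at_20(e, temp_c):
    """Eq. (3) solved for E20: E20 = 1 - (1 - E)^(1/f)."""
    return 1.0 - (1.0 - e) ** (1.0 / aeration_exponent(temp_c))


# Hypothetical readings in mg/L, for illustration only.
e = transfer_efficiency(c_up=1.0, c_down=3.0, c_sat=8.2)
e20 = efficiency_at_20(e, temp_c=24.0)
print(f"E = {e:.3f}, E20 = {e20:.3f}")
```

At exactly 20 °C the exponent equals one and E20 coincides with E; for the 23–25 °C range used in the experiments the correction lowers the measured efficiency slightly.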

Several researchers have studied the oxygen diffusion between air and water caused by falling jets24. It was discovered that the impact angle had very little effect on the volumetric oxygen transfer coefficient25. The air/water oxygen transfer in the biological aerated filter was studied26. The liquid properties that affected the speeds at which oxygen and air were carried in plunging jet reactors were examined27. Multiple falling jets for oxygen transport were described by Deswal and Verma28. Chanson and Brattberg29 researched air entrainment via a two-dimensional plunging jet, while Deswal and Verma30 investigated air/water oxygen transfer in an aerated biological filter. The authors have demonstrated in the experiments5,31 that nozzle shapes, or jet geometry, affect air absorption and oxygen transport.

Hydraulics research has historically relied on empirical formulas, mathematical models, and physical tests. Physical tests are simple, but they take a lot of time and often yield inaccurate results. With the advent of soft computing, solutions have emerged for problems in hydraulic engineering such as predicting aeration efficiency. Soft computing models have drawn considerable attention in engineering32,33,34,35,36 because they can learn the complex relationships between different factors from historical data and then use that information to generate accurate predictions on new data. Soft computing techniques have also proved useful in aeration studies. Baylar et al.37 successfully applied the adaptive neuro-fuzzy inference system (ANFIS) and least squares support vector machines to data sets of air entrainment rate and aeration efficiency obtained from overfall jets descending from a triangular weir. Multiple linear and multiple nonlinear regression-based predictive equations were employed to compare the efficacy of the various modelling techniques. Bagatur and Onen38 investigated gene expression programming (GEP) as an alternative for forecasting the design coefficient of an ogee-crested spillway. Support vector machines (SVM) and GEP techniques were used by the authors to forecast the volumetric oxygen transfer coefficient of multiple plunging jets descending into a still water pool39. To predict the volumetric oxygen transfer coefficient of vertical and angled multiple jets, GEP modelling was used to assess kernel-function-based support vector regression and multiple linear regression40. Using artificial neural network (ANN) and nonlinear regression techniques, Kramer et al.41 successfully evaluated the penetration depth of plunging water jets with extended discharge. Kumar et al.42 predicted the volumetric oxygen transfer coefficient with soft computing models such as ANN, ANFIS, multiple non-linear regression, multivariate adaptive regression splines, and generalized regression neural network. ANFIS with a bell-shaped membership function and ANN were found to perform better than the other models. In another study, the efficacy of soft computing approaches such as SVM, M5P, and multiple non-linear regression was assessed for predicting the volumetric oxygen transfer coefficient. The experimental tests were performed on hollow jet aerators with jet plunging angles of 30°, 45°, and 60°. The results indicated that SVM was the best of the regression models43. In the current work, experiments are carried out to study the aeration efficiency of plunging jets fabricated from acrylic sheets. The tilting flume equipment of the hydraulics laboratory was used for the experiments. As far as the authors are aware, little literature is available on jet aeration in open channel water flow, and none of the existing studies uses the range of input parameters considered here to investigate aeration efficiency.

Null Hypothesis (H0):

The input variables considered in the present study, namely θ (°), Q (L/s), Jn (number), HRJn (cm), and Fr, have no significant effect on the output variable, E20.

Alternate Hypothesis (HA):

The aforementioned input variables have a significant effect on E20.

Thus, the study is innovative and pursues the following goals:

  • Assessing the impact of the input parameters θ (°), Q (L/s), Jn (number), HRJn (cm), and Fr on the output variable, E20.

  • Predicting E20 with the soft computing techniques ANN, M5P, and RF.

  • Conducting a sensitivity analysis to ascertain the influence of each variable on E20.

Soft computing techniques

The following section describes the soft computing techniques used to predict E20 in the current study.

Artificial neural networks

ANNs are computer systems whose design was inspired by the structure and function of biological neurons and neural networks. "Neural" in ANN refers to a neuron, while "network" refers to the interconnected framework of such neurons in biological systems. An ANN comprises interconnected artificial neurons set up to resemble the characteristics of natural neurons, and these neurons work together to solve a particular problem. The ANN design incorporates many user-defined settings, which are usually selected by trial and error to obtain a workable network. ANNs are black-box approaches, so the prediction equation is hidden from the user. Nodes pass data from layer to layer through the network, and epochs are complete cycles through the training data44,45. The training period of an ANN grows as the dataset size increases. A typical network has one input layer with neurons representing the input variables, one or more hidden layers with computational neurons that transform and transmit the information from the preceding layer, and one output layer with a prediction node46,47. A network comprising biases, a sigmoid layer, and a linear output layer can approximate any function with a finite number of discontinuities48.
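The layered structure described above can be sketched as a single forward pass through a network with one sigmoid hidden layer and a linear output node. This is only an illustration: the weights are random rather than trained, the layer sizes and the input vector are assumptions, and it is not the WEKA implementation used in the study.

```python
import math
import random


def sigmoid(x):
    """Logistic activation used by the hidden neurons."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_biases, out_weights, out_bias):
    """One forward pass: sigmoid hidden layer followed by a linear output node."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + b)
        for weights, b in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(out_weights, hidden)) + out_bias


# Illustrative sizes: 5 inputs (theta, Q, Jn, HRJn, Fr) and 10 hidden neurons.
rng = random.Random(0)
n_inputs, n_hidden = 5, 10
hidden_weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
hidden_biases = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
out_weights = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
out_bias = 0.0

# Hypothetical, already-normalised input vector (not measured data).
x = [0.5, 0.3, 1.0, 0.2, 0.7]
print(forward(x, hidden_weights, hidden_biases, out_weights, out_bias))
```

Training would adjust the weights and biases over many epochs so that the output approaches the measured E20; that step is omitted here.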

Random forest

Ensemble learning methods, which generate many predictors and combine their results, have attracted considerable interest in machine learning research. Widely used ensemble methods include boosting, bagging and, more recently, random forest (RF)49. The RF approach maps input vectors onto an ensemble of tree predictors, each grown from a random sample of the inputs. Breiman50 devised the random forest technique, which later proved to be a highly effective general-purpose classification and regression tool. The splitting parameters are selected on the basis of the optimal split, and the settings are tuned by trial and error. By aggregating a collection of random trees, the RF technique creates a random forest51. RF combines bagging with random subspace selection: it pools many weak trees and makes its decision by majority vote (or, for regression, by averaging the tree outputs). Two user-defined settings must be chosen: the number of features examined to determine the optimal split at each node, and the number of decision trees to grow (Ntree)52,53. In practice, about two-thirds of the training data is used to generate each tree. Performance can be estimated using the out-of-bag (OOB) data, the part of the training samples not used to grow a given tree. The random forest regression therefore contains N trees, where N is the number of trees to be grown, which the user may set to any integer. Each of the N trees is built using the CART (classification and regression trees) method without pruning, so each tree is allowed to grow to its full depth on its own training sample. A "Gini" index is used to measure the impurity of a parameter with respect to the output when selecting the parameters used to build individual trees. Compared with a single regression tree, a regression forest is less sensitive to the training data: a single tree depends critically on its training dataset, and minor adjustments to the dataset or the splitting criteria may produce different tree topologies and hence different conclusions. RF models also rank the variables according to their relevance to create the best RF model50.
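The bagging step and the out-of-bag set mentioned above can be illustrated with a short sketch. It only draws the bootstrap samples and does not grow any trees; the function names and the choice of 100 trees are assumptions, and 42 is used as the sample size because it matches the training-set size reported later in the paper.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement (one bag for one tree)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def out_of_bag(n_samples, bag):
    """Rows never drawn into the bag; usable as a validation set for that tree."""
    drawn = set(bag)
    return [i for i in range(n_samples) if i not in drawn]


rng = random.Random(42)
n_samples, n_trees = 42, 100  # 100 trees is an illustrative assumption
oob_fractions = []
for _ in range(n_trees):
    bag = bootstrap_indices(n_samples, rng)
    oob_fractions.append(len(out_of_bag(n_samples, bag)) / n_samples)

mean_oob = sum(oob_fractions) / n_trees
print(f"mean in-bag fraction: {1 - mean_oob:.2f}, mean OOB fraction: {mean_oob:.2f}")
```

Sampling with replacement leaves roughly a third of the rows out of each bag on average, which is where the "about two-thirds" figure for the in-bag data comes from.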

M5P

Quinlan54 developed the M5P algorithm, which has the advantage of handling large datasets with many attributes efficiently. It can also handle imperfect information robustly. This tree approach divides the data space into several sub-spaces at the terminal nodes and then fits a multivariate linear regression model to each sub-space. The M5P tree is constructed in two steps. In the first stage, a splitting method is used to build a decision tree. The splitting criterion treats the standard deviation of the class values that reach a node as a measure of the error at that node, and calculates the expected reduction in this error from testing each attribute at that node. The attribute that maximises the expected error reduction is chosen for the split, which produces the initial tree model. Linear functions are then constructed at each node. The standard deviation reduction (SDR) is given as:

$$SDR=sd\left(N\right)-\sum_{i=1}^{x}\frac{\left|{N}_{i}\right|}{\left|N\right|}\,sd\left({N}_{i}\right),$$
(5)

where \(N\) is the set of samples reaching the node, \({N}_{i}\) is the \({i}^{th}\) subset of samples produced by the split, \(x\) is the number of subsets, and \(sd\) is the standard deviation.
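The splitting criterion can be illustrated with a small sketch of the standard deviation reduction, assuming the usual form in which each subset's standard deviation is weighted by its share of the samples reaching the node. The target values below are hypothetical, and the population standard deviation is used as a simplifying assumption.

```python
from statistics import pstdev


def sdr(parent, subsets):
    """Standard deviation reduction for splitting `parent` into `subsets`."""
    weighted = sum(len(s) / len(parent) * pstdev(s) for s in subsets)
    return pstdev(parent) - weighted


# Hypothetical target values reaching a node, and two candidate splits.
node = [0.10, 0.12, 0.11, 0.28, 0.30, 0.29]
good_split = [node[:3], node[3:]]        # separates low values from high values
poor_split = [node[::2], node[1::2]]     # mixes low and high values
print(sdr(node, good_split), sdr(node, poor_split))
```

A split that separates similar target values yields a larger reduction, so it would be preferred when the tree is grown.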

The tree is pruned in the second stage. Pruning removes the marginal branches (terminal sections) to ensure strong prediction performance, with the components to be trimmed selected on the basis of a criterion. After pruning, the new leaves are determined from the data used in the learning procedure. Finally, a smoothing step compensates for irregularities between the linear models in neighbouring leaves of the tree, which typically results in better predictions.

Performance evaluation

Correlation coefficient (CC)

The correlation coefficient (CC), also referred to as Pearson's correlation, is one of the most frequently used and reported statistical measures. It estimates the strength of a linear relationship between two variables and takes a value between −1 and +1, where −1 indicates a perfect negative correlation, +1 a perfect positive correlation, and 0 no correlation.

$$CC=\frac{\sum_{i=1}^{N}\left({k}_{i}-\overline{k }\right)\left({l}_{i}-\overline{l }\right)}{\sqrt{\sum_{i=1}^{N}{\left({k}_{i}-\overline{k }\right)}^{2}}\sqrt{\sum_{i=1}^{N}{\left({l}_{i}-\overline{l }\right)}^{2}}},$$
(6)

where \({k}_{i}\) represents the predicted value, \(\overline{k }\) the mean of the predicted values, \({l}_{i}\) the observed value, \(\overline{l }\) the mean of the observed values, and \(N\) the number of observations.

Root mean square error

The RMSE represents the sample standard deviation of the differences between the observed values (\({l}_{i}\)) and the predicted values (\({k}_{i}\)), where \(N\) is the number of observations. RMSE is best suited to errors that follow a normal distribution.

$${\text{RMSE}}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}{\left({l}_{i}-{k}_{i}\right)}^{2}}.$$
(7)

Mean absolute error

The MAE is used to estimate how well a prediction fits the actual results. It assigns each error the same weight and is best suited to uniformly distributed errors.

$${\text{MAE}}\hspace{0.17em}=\frac{1}{N}\sum_{i=1}^{N}\left|{l}_{i}-{k}_{i}\right|.$$
(8)
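The three evaluation measures in Eqs. (6)–(8) can be written as short functions, shown here as a sketch with hypothetical function names. Here `predicted` corresponds to \({k}_{i}\) and `observed` to \({l}_{i}\); the sample values are invented for illustration.

```python
import math


def mean(values):
    return sum(values) / len(values)


def correlation_coefficient(predicted, observed):
    """Eq. (6): Pearson correlation between predicted and observed values."""
    k_bar, l_bar = mean(predicted), mean(observed)
    num = sum((k - k_bar) * (l - l_bar) for k, l in zip(predicted, observed))
    den = (math.sqrt(sum((k - k_bar) ** 2 for k in predicted))
           * math.sqrt(sum((l - l_bar) ** 2 for l in observed)))
    return num / den


def rmse(predicted, observed):
    """Eq. (7): root mean square error."""
    return math.sqrt(mean([(l - k) ** 2 for k, l in zip(predicted, observed)]))


def mae(predicted, observed):
    """Eq. (8): mean absolute error."""
    return mean([abs(l - k) for k, l in zip(predicted, observed)])


# Hypothetical E20 values, for illustration only.
observed = [0.10, 0.15, 0.21, 0.26, 0.32]
predicted = [0.11, 0.14, 0.22, 0.25, 0.30]
print(correlation_coefficient(predicted, observed),
      rmse(predicted, observed), mae(predicted, observed))
```

Because RMSE squares the differences before averaging, it is never smaller than MAE on the same data, so the two are usually reported together.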

Methodology

The experimental tests for the current investigation were carried out in a tilting flume with dimensions of 45 × 25 × 500 cm (Figs. 1, 2 and 3). A 2 HP electric motor was used to circulate the water in the flume. Seven interchangeable acrylic sheets with 1, 2, 4, 8, 16, 32, and 64 jets served as the aeration devices (Table 1). Each screen was evaluated at Q values of 3.41 L/s, 3.84 L/s, and 4.75 L/s. In the literature reviewed, Q lies in the range of 0.1–4.69 L/s55,56,57.

Figure 1 Tilting flume equipment experimental setup.

Figure 2 Plunging water jet with Jn = 1.

Figure 3 Plunging water jet with Jn = 64.

Table 1 Jets configuration (plate dimensions = 45 × 25 cm).

In field examples, the discharge was found to lie between 1 L/s and 6 L/s for aquifer systems in Bengaluru58, and between 1.1 L/s and 8 L/s as recommended by WATEX59. The values of θ considered in the current study are 0°, 1.5°, and 3°. Each acrylic sheet was positioned within the flume and adjusted so that water entered the downstream pool only through the jet holes. To deoxygenate the tank water before the tests began, sodium sulphite (Na2SO3) and a cobalt chloride (CoCl2) catalyst were added to the water tank. Using the azide modification method60, the initial concentration of dissolved oxygen (\({C}_{up}\)) was determined from a sample of oxygen-depleted water taken upstream of the jet device. Aeration then proceeded for a predetermined period (t = 2 min), after which a sample of oxygenated water was taken downstream of the aeration device to determine the dissolved oxygen concentration (\({C}_{down}\)). A laboratory thermometer was used to monitor the water temperature during the experiments. Equations (2), (3) and (4) were then used to obtain E20. The input and output data for the 63 experiments are listed in Table S1 (supplementary data). Three soft computing techniques, ANN, M5P, and RF, were then used to predict E20. Of the 63 experimentally recorded readings, 42 were randomly chosen for the training dataset and the remaining 21 formed the testing dataset. The characteristics of both datasets, such as mean, median, and standard deviation, are shown in Table 2 so that the training and testing datasets can be compared; they were used to check the testing dataset. A representation of the procedure is shown in Fig. 4.
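The experimental design and the random split described above can be sketched as follows. The helper name `split_readings` and the seed are assumptions made for illustration; the actual random selection used in the study is not specified in the text.

```python
import random


def split_readings(readings, n_train, seed=0):
    """Shuffle a copy of the readings and split it into training and testing parts."""
    shuffled = list(readings)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# 7 jet plates x 3 discharges x 3 bed angles = 63 experimental conditions.
jets = [1, 2, 4, 8, 16, 32, 64]
discharges = [3.41, 3.84, 4.75]   # L/s
angles = [0.0, 1.5, 3.0]          # degrees
conditions = [(jn, q, th) for jn in jets for q in discharges for th in angles]

train, test = split_readings(conditions, n_train=42)
print(len(conditions), len(train), len(test))
```

Fixing the seed makes the split repeatable, which helps when several models are compared on the same training and testing readings.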

Table 2 Statistics of dataset.

Figure 4 Representation of methodology.

Experimental results

Effect of number of jets

Figure 5 shows the impact of the number of jets (Jn) on E20 at angles of inclination (θ) of 0° (Fig. 5a), 1.5° (Fig. 5b), and 3° (Fig. 5c). E20 increases as Jn rises. At θ = 0°, Jn = 64 gave the highest E20, ranging from 0.21 to 0.25. At θ = 1.5°, E20 for Jn = 64 rose to 0.22–0.29, and at θ = 3°, Jn = 64 gave the maximum aeration, between 0.26 and 0.32. In summary, the jet device with the maximum number of jets (Jn = 64) provides E20 values from 0.21 to 0.32 across angles of inclination from 0° to 3°. The increase in E20 with Jn for multiple plunging jets can be attributed to greater air entrainment, because the total surface area of the jets in contact with the atmosphere increases with the number of jets.

Figure 5 Effect of Jn on E20 at θ (a) 0° (b) 1.5° and (c) 3°.

Effect of discharge

The effect of discharge (Q) on E20 was also observed, as shown in Table 3. When Q is increased from 3.41 L/s to 4.75 L/s, E20 increases by 33.4% and 20.54% at θ = 0°, by 24.08% and 28.11% at θ = 1.5°, and by 76.02% and 21.28% at θ = 3°, for Jn = 1 and Jn = 64, respectively. Higher Q therefore contributes to higher E20, with the increase ranging from about 20% to 76% for plunging jets with Jn = 1 and 64. A greater number of jets and a greater discharge provide a larger air–water contact area and increase turbulence, and this increased turbulence can be linked to the increase in E20. As the discharge is increased from 3.41 L/s to 4.75 L/s, the jets acquire sufficient kinetic energy to penetrate deeper into the tank, and more oxygen is carried into the pool as a result of the larger air–water contact area.

Table 3 Values of E20 for different discharge, angle of inclination, and jet numbers.

Effect of angle of inclination

Table 3 shows the effect of the angle of inclination (θ) on E20. A higher angle of inclination contributed to a higher E20. The increase in E20 as θ rose from 0° to 3° was found to exceed 25% for plunging jets with Jn = 1 and Jn = 64. The increase in E20 with θ is attributed to the larger air–water contact area created by the multiple jet holes, the greater turbulence at a higher angle of inclination, and the increased jet velocity.

Effect of Froude number

The impact of the Froude number of each jet (Fr) on E20 at different angles of inclination and discharge rates is illustrated in Fig. 6a–c.

Figure 6 Effect of Fr on E20 at θ (a) 0° (b) 1.5° and (c) 3°.

Fr depends on the discharge rate and the jet area and is determined using Eq. (9):

$$Fr= \frac{v}{\sqrt{g\times {D}_{{J}_{n}}}}; {\mathrm{ J}}_{{\text{n}}}=1, 2, 4, 8, 16, 32, 64.$$
(9)

Here, \(v\) is the average velocity measured downstream of the plunging jets after bubble formation (cm/s), and g denotes the gravitational acceleration (cm/s2). DJn is the diameter of each jet, determined using the following equation:

$${{\text{D}}}_{{{\text{J}}}_{{\text{n}}}}=2\sqrt{\frac{{\text{Jet}}\hspace{0.33em}{\text{area}}}{\uppi \times {{\text{J}}}_{{\text{n}}}}};\hspace{1em}{{\text{J}}}_{{\text{n}}}=1, 2, 4, 8, 16, 32, 64.$$
(10)

The parameters used in the calculation are listed in Table 4. The total jet area is 30.75 cm2, so \({{\text{D}}}_{{{\text{J}}}_{{\text{n}}}}\) decreases as Jn increases. Fr was found to increase with increasing Q and decreasing \({{\text{D}}}_{{{\text{J}}}_{{\text{n}}}}\).
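Equations (9) and (10) can be evaluated with a short sketch such as the one below. The value of g (981 cm/s2) is the standard figure and is assumed here, and the velocity is a hypothetical number rather than a measurement from Table 4.

```python
import math

TOTAL_JET_AREA_CM2 = 30.75   # total flow area shared by all jets on a plate
G_CM_S2 = 981.0              # gravitational acceleration in cm/s^2 (assumed standard value)


def jet_diameter(n_jets, total_area=TOTAL_JET_AREA_CM2):
    """Eq. (10): diameter of each jet in cm."""
    return 2.0 * math.sqrt(total_area / (math.pi * n_jets))


def froude_number(velocity_cm_s, n_jets):
    """Eq. (9): Froude number of each jet."""
    return velocity_cm_s / math.sqrt(G_CM_S2 * jet_diameter(n_jets))


velocity = 50.0  # cm/s, hypothetical value for illustration only
for jn in (1, 2, 4, 8, 16, 32, 64):
    print(jn, round(jet_diameter(jn), 3), round(froude_number(velocity, jn), 3))
```

Because the total area is fixed, quadrupling the number of jets halves the diameter of each jet, so Fr rises with Jn at a given velocity.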

Table 4 Results of parameters used in Fr calculation.

In Fig. 6a–c, E20 rises as Fr rises. E20 also increases as Q increases from 3.41 to 4.75 L/s and as θ increases from 0° to 3°. This is due to the higher fluid velocity and the increased inclination angle of the slope, both of which affect Fr. As the fluid velocity increases, Fr increases, indicating that inertial effects become more dominant. Similarly, increasing the inclination angle of the slope also increases Fr. Furthermore, Fr affects E20 because it influences the rate of air entrainment. When Fr is low (Fr < 1), the flow is subcritical, and air bubbles tend to rise slowly and follow the flow, resulting in less air entrainment. Conversely, at high Froude numbers (Fr > 1), the flow is supercritical, and the high turbulence in the water pool causes air bubbles to break up into smaller ones, leading to an increased air–water interfacial area and thus an enhanced air entrainment rate. Therefore, to attain maximum E20, an optimal Fr must be achieved.

Effect of hydraulic radius of jets

The cumulative hydraulic radius (HR) is an important quantity in open channel fluid mechanics. It is determined using the following equations:

$${\text{HR}}={{\text{HR}}}_{{{\text{J}}}_{{\text{n}}}}\times {{\text{J}}}_{{\text{n}}};{\mathrm{ J}}_{{\text{n}}}=1, 2, 4, 8, 16, 32, 64,$$
(11)
$${{\text{HR}}}_{{{\text{J}}}_{{\text{n}}}}=\frac{{\text{Jet}}\hspace{0.33em}{\text{area}}}{\uppi {\times {{\text{D}}}_{{{\text{J}}}_{{\text{n}}}}\times {\text{J}}}_{{\text{n}}}}.$$
(12)

The impact of HR on E20 at different discharge rates and angles of inclination is illustrated in Fig. 7a–c, which shows that E20 increases with HR. E20 also increases as θ increases from 0° to 3° and as Q increases from 3.41 to 4.75 L/s. The wetted perimeter decreases with increasing HR, indicating that a smaller amount of water is in proximity to the channel section, which lowers the resistance to flow and enables more discharge to pass through it, resulting in increased E20.
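As a short consistency check on Eqs. (10)–(12), and assuming that "Jet area" denotes the total flow area shared by all \({J}_{n}\) circular jets (as Eq. (10) implies), Eq. (10) can be inverted and substituted into Eq. (12):

$${\text{Jet area}}={J}_{n}\frac{\pi {D}_{{J}_{n}}^{2}}{4}\quad \Rightarrow \quad {\text{HR}}_{{J}_{n}}=\frac{{J}_{n}\pi {D}_{{J}_{n}}^{2}/4}{\pi \times {D}_{{J}_{n}}\times {J}_{n}}=\frac{{D}_{{J}_{n}}}{4},\quad {\text{HR}}={J}_{n}\frac{{D}_{{J}_{n}}}{4}.$$

Each jet therefore has the hydraulic radius of a full circular section, one quarter of its diameter, and the cumulative value in Eq. (11) scales that by the number of jets.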

Figure 7 Effect of HR on E20 at θ (a) 0° (b) 1.5° and (c) 3°.

Statistical analysis

Post hoc test

Table 5 shows the ANOVA results for Jn and the E20 values of plunging jets in the present study.

Table 5 ANOVA results with Jn and E20.

The F and significance values are 22.372 and 0.000 (less than 0.05), respectively, so the null hypothesis is rejected. A post-hoc analysis was carried out to investigate the significance of the differences between pairs of group means. The dependent variable for the post-hoc test was E20, and the independent variable was Jn. In Table 6, Jn = 1 (single jet) was taken as the control against which the multiple jets were compared. The significance values for Jn = 2 and Jn = 4 were higher than 0.05, so their mean E20 does not differ significantly from that of the single jet. For Jn = 8 to Jn = 64, the significance values are less than 0.05, so these jets have a significant impact on E20. Table 6 also shows that Jn = 64 has the highest mean difference and thus provides the maximum E20.

Table 6 Post-hoc Tukey’s analysis results for single and multiple jets.

The F value is the ratio of the variance between groups to the variance within groups, and the degrees of freedom reflect the number of groups being compared. In a multi-group comparison, the F value indicates the statistical significance of the differences in group means.

The F value of 22.372 shows that the between-group variance was 22.372 times the within-group variance, implying that the group means were not all equal; hence the null hypothesis was rejected and the alternate hypothesis accepted. This was also verified by the significance value (p-value) of 0.000, which means that the observed differences would be very unlikely if the null hypothesis were true.

A p-value of less than 0.05 leads to rejection of the null hypothesis, which is confirmed by the F-test value. The input parameter Jn has seven levels, i.e., 1, 2, 4, 8, 16, 32, and 64, so the degrees of freedom (df) in this case are 7 − 1 = 6. The critical F value obtained from the F-table with df = 6 was found to be 5.9874 at a significance level of 0.05. Since the critical F value (5.9874) is less than the calculated F value (22.372), the null hypothesis is rejected, which shows that the number of jets (Jn) affects E20 significantly.
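For readers who want to reproduce this kind of test, the sketch below computes a one-way ANOVA F statistic from grouped observations, assuming the standard definition (between-group mean square divided by within-group mean square). The function name and the three small groups are hypothetical and do not come from Table S1.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical E20 readings for three jet plates, for illustration only.
groups = [
    [0.10, 0.12, 0.14],
    [0.15, 0.17, 0.19],
    [0.24, 0.27, 0.30],
]
f_value, df_b, df_w = one_way_anova_f(groups)
print(f"F = {f_value:.3f} with df = ({df_b}, {df_w})")
```

The calculated F would then be compared with the critical value for the corresponding between-group and within-group degrees of freedom at the chosen significance level.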

In Table 6, columns 1 and 2 list the number of jets (Jn): column 1 is the reference, and the performance of the number of jets in column 2 is compared against it through the significance (p) value, which must be less than 0.05 for the null hypothesis to be rejected. The mean difference (I−J) is the difference between the E20 values of columns I and J. The standard error indicates the deviation between the observed values and the mean values. The 95% confidence interval gives the lower and upper bounds within which the mean difference (I−J) falls.

Linear regression analysis

Table 7 shows the regression statistics; the R (correlation coefficient) and R2 values are close to 1, which indicates that the model is satisfactory. Table 8 shows the coefficients for the input parameters θ, Q, Jn, HRJn, and Fr, from which the model (Eq. (13)) was generated.

Table 7 Regression statistics for jets.
Table 8 Linear regression model.

Table 8 lists the regression coefficients, which define the regression equation relating the input parameters to the output parameter E20. The standard error gives the standard deviation associated with the regression line. The standardized coefficients are the coefficients of the regression function with the constant set to 0. The t-test is a parametric test for comparing the means of two groups.

The equation generated from Table 8 is as follows:

$${E}_{20}=0.022\,\theta +0.016\,Q+0.000\,{J}_{n}+0.018\,Fr-0.059\,{\text{HR}}_{{J}_{n}}+0.070.$$
(13)
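Equation (13) can be evaluated directly, as in the sketch below. The function name and the input values are hypothetical; the coefficients are the rounded values printed in Eq. (13), so the Jn term contributes nothing at this precision.

```python
def predict_e20_linear(theta_deg, q_l_s, n_jets, froude, hr_jn_cm):
    """Evaluate Eq. (13) with the rounded coefficients as printed."""
    return (0.022 * theta_deg
            + 0.016 * q_l_s
            + 0.000 * n_jets
            + 0.018 * froude
            - 0.059 * hr_jn_cm
            + 0.070)


# Hypothetical operating point, for illustration only.
print(predict_e20_linear(theta_deg=3.0, q_l_s=4.75, n_jets=64,
                         froude=1.0, hr_jn_cm=0.2))
```

The signs of the coefficients indicate that, in this linear model, E20 rises with θ, Q, and Fr and falls as the hydraulic radius of each jet grows.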

Computational analysis and results

Assessment of ANN model

For the current study, ANN results were obtained using WEKA software. Many ANN architectures were tested until the best outcomes were attained. Selecting the ANN's user-defined settings, such as the number of hidden nodes, the learning rate, and the network geometry, to obtain an optimised model can be difficult. Since the ANN used here has a single hidden layer, the ideal network geometry was found by trial and error. In this study, the hidden layer has 10 nodes, the learning rate is 0.2, the momentum is 0.1, and the training time is 550 epochs. The actual and predicted values of E20 for the ANN model during the training and testing phases are shown in Fig. 8. Since the majority of the points in Fig. 8 lie close to the trend line, the ANN-based model is appropriate for forecasting E20. The outcomes demonstrate close agreement between actual and predicted values. The statistical values for each model created for the current investigation are shown in Table 9. ANN is found to be the best predictive model, with the highest CC value of 0.9823 in the testing stage and the lowest errors, i.e., an MAE of 0.0098 and an RMSE of 0.0123.

Figure 8 Actual and predicted value of E20 using ANN (a) Training, (b) Testing.

Table 9 Performances of ANN, M5P and RF model.

Assessment of M5P model

The M5P model generated for this study is used to predict E20. The model was developed using the training dataset and validated using the testing dataset. The M5P was trained with a batch size of 100 and a minimum of 4 instances per leaf node. Figure 9a and b show the M5P results. The accuracy of the model can be judged by how closely the plotted observed and predicted values follow the regression line (Fig. 9a,b). Table 9 shows that the M5P model gave fair results, with CC values of 0.9765 and 0.9728 in the training and testing stages, respectively. The MAE and RMSE are lower in the training phase and rise modestly in the testing phase.

Figure 9 Actual and predicted value of E20 using M5P (a) Training, (b) Testing.

Assessment of RF model

WEKA software is also used to implement the RF-based model. The RF model is likewise developed by trial and error with some user-defined parameters. Scatter plots of the experimental and predicted values of E20 from the RF model, for the training and testing datasets, are shown in Fig. 10. The scatter points show close agreement with the regression line.

Figure 10 Actual and predicted value of E20 using RF (a) Training, (b) Testing.

Comparison of soft computing-based models

This section compares the ANN, M5P, and RF models used in the current study to predict E20. To assess these models, five input parameters (θ, Q, Jn, HRJn, and Fr) were taken into account. Table 9 shows the results of evaluating each developed model against three statistical evaluation criteria.

The agreement of each model with the experimental data is shown in Fig. 11, and the graphical representation indicates that the models developed for the study predict E20 well. The errors of each model, shown in Fig. 12, must also be evaluated to reach the final conclusion. They indicate that RF exhibits larger errors than the other models in both the training and testing datasets, whereas the ANN model performed consistently in both stages. The box plot of the model outcomes for the testing stage is shown in Fig. 13. The median and maximum values of the actual data and the ANN model are very close. The actual data have an interquartile range (IQR) of 0.122, while ANN, M5P, and RF have IQRs of 0.099, 0.095, and 0.079, respectively. The difference in the mean between the actual and predicted values is smallest in the case of ANN (0.0006).

Figure 11

Comparison of ANN, M5P and RF with actual data.

Figure 12

Error values of ANN, M5P and RF in training and testing stage.

Figure 13

Box plot with actual and soft computing techniques.

Sensitivity analysis

Sensitivity analysis was used to identify the most important input parameter in predicting the E20 of jets in an open channel flow. The best-performing model, i.e., ANN, was used to carry out the analysis. A new training dataset was created each time by eliminating one input parameter, and the results were expressed in terms of CC, MAE, and RMSE. The extent to which these evaluation measures change indicates the significance of the removed variable in influencing E20. Findings from Table 10 indicate that, in comparison to the other input variables, the angle of inclination of the tilting flume's bed is the most dominant variable and has a considerable influence on forecasting E20. Increasing the angle of inclination of the flume bed increases the component of the water weight acting in the flow direction, resulting in higher water velocity. In addition to θ, Fr and Jn have a higher impact on E20. It is well established that aeration efficiency depends on θ and Jn. However, when the five input parameters act together, a sensitivity analysis of each parameter becomes important for establishing its role in achieving E20.

Table 10 Sensitivity analysis using best fit model.
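The leave-one-input-out procedure described above can be sketched as follows. This is only an illustration of the procedure: an ordinary least-squares linear model stands in for the ANN used in the study, the data are synthetic, and the weights in `TRUE_WEIGHTS` are hypothetical values chosen so that θ dominates. The output therefore demonstrates the method and is not evidence for the findings in Table 10.

```python
import math
import random

INPUT_NAMES = ["theta", "Q", "Jn", "HRJn", "Fr"]

def fit_linear(X, y):
    """Ordinary least squares with an intercept (stand-in for the ANN)."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented normal-equation matrix [A^T A | A^T y].
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(k):
            if i != c:
                f = M[i][c] / M[c][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]

def predict(coef, X):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]

def rmse(actual, predicted):
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

def select_columns(X, keep):
    return [[r[i] for i in keep] for r in X]

def leave_one_out_sensitivity(X_train, y_train, X_test, y_test):
    """Refit the model with each input removed in turn; report test RMSE."""
    results = {}
    for drop in [None] + list(range(len(INPUT_NAMES))):
        keep = [i for i in range(len(INPUT_NAMES)) if i != drop]
        coef = fit_linear(select_columns(X_train, keep), y_train)
        pred = predict(coef, select_columns(X_test, keep))
        label = "all inputs" if drop is None else "without " + INPUT_NAMES[drop]
        results[label] = rmse(y_test, pred)
    return results

# Synthetic data: 63 samples, inputs scaled to [0, 1], hypothetical weights.
rng = random.Random(0)
X = [[rng.random() for _ in INPUT_NAMES] for _ in range(63)]
TRUE_WEIGHTS = [0.20, 0.05, 0.02, 0.01, 0.01]
y = [sum(w * v for w, v in zip(TRUE_WEIGHTS, r)) + rng.gauss(0.0, 0.005)
     for r in X]

# 42 training and 21 testing samples, mirroring the split sizes in the study.
results = leave_one_out_sensitivity(X[:42], y[:42], X[42:], y[42:])
for label, value in results.items():
    print(f"{label:>14s}: RMSE = {value:.4f}")
```

The input whose removal degrades the metric most is ranked as the most influential. The same loop applies unchanged to CC and MAE, and to any other model in place of the linear stand-in.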

Discussion

In this work, plunging jets with Jn values of 1, 2, 4, 8, 16, 32, and 64 and a flow area of 30.75 cm² are produced using 7 acrylic sheets. The study examines the E20 of the jets from each sheet in an open channel, using θ, Q, Jn, HRJn, and Fr as input parameters. Each parameter studied significantly affected E20. According to the findings, E20 rises as Jn, Q, and θ increase. Multiple plunging jets transfer oxygen at a much higher rate than a single jet plunging into the water pool28,57. Those studies also demonstrated that higher discharge results in better oxygenation. The present investigation agrees: E20 increases with both Jn and discharge. A higher jet impact angle may boost oxygenation because deeper jet penetration brings more bubbles into contact with the water in the pool, which increases oxygen transfer61. According to the current study, aeration improves as the flume angle θ rises, reaching a maximum E20 of 0.32 (or 32%) at a 3° angle.

It is well documented in the literature that Fr affects turbulence in steady flows of water62,63. E20 was significantly affected by Fr and by the ratio of the water cross-sectional flow area to the duct cross-sectional area64. Puri et al.65 further demonstrate that a rise in Fr is accompanied by an increase in discharge and oxygen transfer. The outcomes of the present study also confirm that E20 and Fr are directly related. It is inferred from Figs. 5, 6, 7, and Table 3 that E20 increases with an increase in the input parameters considered in the current study. Because the input parameters have an impact on E20, H0 must be rejected.

Soft computing, as opposed to conventional computing, approximates complex real-world problems and is tolerant of imprecision, ambiguity, partial truth, and assumptions. The human mind serves as the model for soft computing techniques such as fuzzy logic, genetic algorithms, ANN, ML, and expert systems66. For the management of severely contaminated water resources, the prediction of E20 deserves top priority. This work examines the performance of the ANN, M5P, and RF soft computing models in predicting jet aeration in an open channel flow. Multiple statistical metrics, namely CC, MAE, and RMSE, have been used to measure the efficacy of the models. The outcomes demonstrate that ANN is the best-performing model for predicting E20, while RF is the least-performing model for the given dataset. According to the current study, all three models can predict E20 accurately. However, with 10 hidden layers, a training time of 550, a learning rate of 0.3, and a momentum of 0.2, the ANN model reached a CC of 0.9823 in the testing stage, compared with 0.9728 for M5P and 0.9682 for RF, making ANN the more effective model. Since M5P and RF both have CC values above 0.95, which is a competent level, their performance cannot be dismissed. In several studies67,68, ANN has likewise been identified as the best predictive ML technique for the problem at hand. Researchers have also found that depending upon the number of the inputs and computational time Sensitivity analysis was also performed in order to understand the effect of each parameter on E20, and the results revealed that jet aeration in an open channel is highly sensitive to the angle of inclination of the tilting flume.
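To illustrate how the learning rate (0.3) and momentum (0.2) quoted above enter the training of a network, the following sketch applies the classical gradient-descent-with-momentum update to a single weight. It assumes that the "training time" of 550 denotes the number of training epochs; the toy objective and the function name `momentum_step` are hypothetical and do not reproduce the WEKA implementation used in the study.

```python
def momentum_step(weight, gradient, velocity, learning_rate=0.3, momentum=0.2):
    """One gradient-descent update with classical momentum.

    velocity_new = momentum * velocity - learning_rate * gradient
    weight_new   = weight + velocity_new
    """
    velocity = momentum * velocity - learning_rate * gradient
    return weight + velocity, velocity

# Toy objective f(w) = (w - 2)^2 with gradient 2 * (w - 2); illustration only.
w, v = 0.0, 0.0
for epoch in range(550):  # assumed meaning of "training time"
    grad = 2.0 * (w - 2.0)
    w, v = momentum_step(w, grad, v)

print(f"weight after training: {w:.6f}")
```

The learning rate scales the step taken along the current gradient, while the momentum term carries over a fraction of the previous step, which damps oscillation and can speed up convergence.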

To summarise the performance of the ANN model relative to the RF model: of the 63 readings recorded experimentally, 42 were chosen randomly for the training dataset and 21 for the testing dataset. Random forest may not give good results for small datasets or low-dimensional data (data with few features); its strengths lie in processing high-dimensional data and data with missing features69. Here, the small datasets (42 training and 21 testing readings) and the low dimensionality of the input, limited to five parameters, i.e., angle of inclination (θ), discharge (Q), number of jets (Jn), hydraulic radius of each jet (HRJn), and Froude number (Fr), are possible reasons for the weaker RF performance. The ANN, by contrast, offers more scope for tuning through the number of hidden layers, training time, learning rate, momentum rate, etc. ANN models also provide certain advantages over regression-based models, including the capacity to deal with noisy data. ANNs consist of a layer of input nodes and a layer of output nodes, connected by one or more layers of hidden nodes. Input layer nodes pass information to hidden layer nodes by firing activation functions, and hidden layer nodes fire or remain dormant depending on the evidence presented. The hidden layers apply weighting functions to the evidence, and when the value of a particular node or set of nodes in the hidden layer reaches some threshold, a value is passed to one or more nodes in the output layer. ANNs can incorporate uncertainties by estimating the likelihood of each output node.
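The random 42/21 split and the layered structure described above can be sketched as follows. This is a minimal illustration under stated assumptions: a single hidden layer of 10 sigmoid nodes, one linear output node, and random untrained weights. The function names, the seeds, and the input values are hypothetical and do not reproduce the trained network of this study.

```python
import math
import random

def split_indices(n_total=63, n_train=42, seed=0):
    """Randomly partition reading indices into training and testing sets."""
    rng = random.Random(seed)
    indices = list(range(n_total))
    rng.shuffle(indices)
    return indices[:n_train], indices[n_train:]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: inputs -> one sigmoid hidden layer -> one linear output."""
    hidden = [sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)
              for weights, bias in zip(hidden_weights, hidden_biases)]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias

train_idx, test_idx = split_indices()

# Hypothetical untrained network: 5 inputs, 10 hidden nodes, random weights.
rng = random.Random(1)
n_inputs, n_hidden = 5, 10
hidden_weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
hidden_biases = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
output_weights = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
output_bias = 0.0

# Inputs (theta, Q, Jn, HRJn, Fr) scaled to [0, 1]; values are illustrative.
x = [1.0, 0.5, 0.25, 0.3, 0.6]
e20_estimate = forward(x, hidden_weights, hidden_biases,
                       output_weights, output_bias)
print(len(train_idx), len(test_idx), e20_estimate)
```

Training then consists of adjusting the weights and biases so that the output matches the measured E20 on the 42 training readings, with the 21 held-out readings used only to assess generalisation.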

The practical implication of the study is that plunging jets of circular geometry can raise the DO level in water, achieving an E20 of up to 32%. This increase can be useful for aquaculture, which supports the sustainability of aquatic life. Oxygenated water may also benefit the health of the stakeholders who use it, and the enriched water can be congenial to agricultural and horticultural produce. The oxygenated water is produced by circular plunging jets acting under gravity in open channel flow, for which no electrical power supply is required.

Conclusions

The current study examines the angle of inclination, number of jets, discharge, Froude number, and hydraulic radius of jets to determine the efficacy of aerating deoxygenated water with a novel form of circular plunging jets produced from acrylic screens. The experimental findings demonstrated that the aeration performance of multiple jets is better than that of a single jet. E20 increased by 20–76% for Jn = 1 to 64 when Q was increased from 3.41 L/s to 4.75 L/s. An increase of θ from 0° to 3° raised E20 by more than 25% in the said plunging jets. The post-hoc analysis showed that jet numbers from 8 to 64 significantly affect E20. According to the developed linear model, all the parameters except the hydraulic radius of each jet have a positive effect on E20. Further, E20 was predicted using soft computing methods, including ANN, M5P, and RF. ANN outperformed the other applied models, with a CC value of 0.9823 in the testing stage and errors of 0.0098 (MAE) and 0.0123 (RMSE). The sensitivity analysis results showed that the angle of inclination of the bed of the tilting flume, followed by the number of jets, is the most influential parameter affecting aeration efficiency.