Introduction

Rivers are a vital freshwater source for human consumption, irrigation, industrial use and ecosystem services. Temporal and spatial information on river water quality, and on its continued improvement, is essential for managing these freshwater resources effectively. The decline in water quality in Indian rivers results occasionally from natural processes (for example, during flood pulses), but far more from anthropogenic activities such as the discharge of industrial effluents, domestic wastewater and municipal sewage, in addition to non-point agricultural drainage. Industrial and domestic wastewater drainage, together with agricultural run-off from fields, accounts for a very large proportion of river pollution (Dutta et al. 2018). Akbarzadeh et al. (2018) reported that where untreated wastewater from the Seine Aval wastewater treatment plant of the Paris municipality was discharged into the lower Seine River, the dissolved oxygen level dropped significantly in the downstream 100 km segment (Aissa-Grouz et al. 2015). In China, regulating sewage release and urban run-off is crucial to recovering the water quality of the Pearl River estuary (He et al. 2014). Natural factors such as temperature and precipitation, together with anthropogenic factors, affect river water quality and give rise to distinct characteristics between seasons (Kumar et al. 2015). Hence, continuous monitoring and evaluation of river water quality is a prerequisite for integrated water resources management (Wang et al. 2013). Sampling networks are an excellent source of information for monitoring river water quality and for visualizing the current condition of rivers, both spatially and temporally. Besides giving a general outline of conditions over time, these networks also provide evidence of ecosystem change at seasonal and geographical scales (Barakat et al. 2016).

Effective management of surface water bodies relies on techniques such as modelling approaches and monitoring programs, which identify water pollution, help detect regulatory violations and enable the authorities to make firm decisions in the form of guidelines and regulations for managing both water quality and quantity. Modelling approaches study water balance, pollutant transport and water quality comprehensively in order to develop an interactive model (Lorenz et al. 2014). The monitoring approach involves measuring the water quality of a surface water body over an extended period. Such monitoring programs produce very large, hard-to-interpret datasets covering many physico-chemical parameters, and this complexity can make analysis and comprehension difficult. Many indicators are available for assessing the quality of aquatic ecosystems and their evolution under human-induced influences and pressures over extended periods. However, little attention has been given to developing specific tools for the rapid assessment of water quality (Sorche et al. 2019). Surface water monitoring requires indicators and descriptors that incorporate ecosystem-specific features and the processes operating within a system. Special consideration should also be given to the condition of the ecosystem when it is susceptible to disturbances such as organic and inorganic pollution from nutrients (nitrates, phosphates), heavy metals (As, Pb, Cu, etc.) and other toxic constituents such as pesticides.

Monitoring the quality of water resources is essential for assessing their suitability for a healthy ecosystem, a hygienic environment, and domestic and other uses such as recreation, agriculture and aquaculture. Assessing all the water quality parameters of a water body is a complicated task, and the bulky data obtained from analysing all the samples and comparing them with existing guidelines are difficult to handle. The water quality index (WQI) system was therefore developed to simplify the presentation of the water quality status of a water body.

The main advantage of a WQI is that it gives a comprehensive picture of the water quality of the studied area. Internationally developed water quality indices include the US National Sanitation Foundation Water Quality Index (NSFWQI), the British Columbia Water Quality Index (BCWQI), the Oregon Water Quality Index (OWQI) and the Canadian Council of Ministers of the Environment Water Quality Index (CCMEWQI), and others continue to be added. These indices meet the criteria of a good all-inclusive index, as they integrate the parameters that can be studied for an aquatic ecosystem. WQIs are user-friendly and easily handled in computational tools (Shrestha and Basnet 2018). The WQI principle has been revised to some extent to extend the existing concept to other forms of water quality index (Singh et al. 2018) that reflect water quality conditions in various environmental circumstances, differing only in how the sub-indices and the aggregation function are developed. Other indices include the Transitional Water Quality Index (TWQI), which integrates six variables: dissolved oxygen, dissolved inorganic nitrogen and phosphorus, phytoplankton chlorophyll-a, benthic phanerogams and macroalgal species. A Groundwater Quality Index (GWQI) was developed from eight principal drinking water parameters, namely K+, Na+, Ca2+, Mg2+, SO42−, Cl−, pH and total dissolved solids (TDS), to monitor groundwater quality in the Qazvin plateau area of Iran (Saeedi et al. 2010). Twenty-two water quality parameters were measured to formulate a WQI in a study carried out in Varanasi, India (Chaurasia et al. 2018).

Physical, chemical and biological parameters are also helpful for assessing the water quality of a particular area or source. If the values of these parameters exceed the permissible limits, the water becomes harmful to human health. The water quality index (WQI) is among the most effective approaches for describing water quality and defining the suitability of water sources for human consumption. WQI has been applied extensively to assess the quality of different water resources, both groundwater and surface water, predominantly rivers, and has played an important role in water resources management (Sutadian et al. 2016; Mohebbi et al. 2013). Multiple environmental parameters are combined and converted into a single value that reveals the water quality status. Consequently, instead of reporting the results of the various parameters separately, WQI aggregates them and gives cohesive information about overall water quality, which makes it a potent approach for water quality assessment and management (Wu et al. 2018).

The prevailing water quality of a region varies both spatially and temporally, depending on the daily routines and practices of its population, their occupations and other influencing factors such as meteorological and bio-geophysical conditions. Water quality is greatly influenced by the combined effects of these factors, which therefore play a major role in its assessment. Because all of these factors, except to some extent the bio-geophysical ones, may change with time, it is very difficult to develop an index that is universally accepted and suitable for all regions of the world. It is therefore of utmost importance to develop an index that is specific to a particular region at a particular point in time. Moreover, long-term (i.e. multi-year) assessment and analysis of the various parameters of a water body is tedious and labour-intensive, and yields an index only after years of analysis. Such an index reflects the long-term data, although the present condition of the water body may no longer be the same.

Systematic and continuous monitoring is required to detect fluctuations in the concentrations of the physiochemical parameters. A rapid assessment of selected parameters is important for developing a WQI for any region, so that a quick overview of the prevailing condition of a water body can be given and any alarming situation conveyed to the government, policy makers and the general public in a simple and understandable manner, as the local population are the real stakeholders. In the present study, a short-term assessment of water quality is carried out to decipher the condition of the Gomti River in the Lucknow city stretch by developing a rapid WQI that turns a large dataset for the Gomti River in Lucknow into useful information for its proper management.

Multivariate statistical approaches are the most appropriate and widely used methods for processing and examining data obtained over time. They can condense large volumes of spatial and temporal data from different monitoring sites into comprehensible information. Statistical techniques such as principal component analysis (PCA), cluster analysis (CA) and factor analysis (FA) are very useful for riverine studies because they can evaluate spatial as well as temporal variations in river quality and identify potential sources of water contamination (Sabia et al. 2018; Alilou et al. 2018; Li et al. 2017; Iqbal et al. 2017; Villas-Boas et al. 2017; Zhao et al. 2016).

Principal component analysis and cluster analysis were used to study the hydrochemistry of the Thamirabarani river basin in South India (Kumarasamy et al. 2014). Khan et al. (2016) applied correlation analysis, PCA and CA to identify possible causes of pollution, study seasonal variations and cluster the monitoring locations on the Yamuna and Ganga Rivers in the State of Uttarakhand, India. These statistical techniques have been used and validated by different workers in different regions and river systems (Diamantini et al. 2018; Nguyen et al. 2017; Ojok et al. 2017; Chowdhury et al. 2016). PCA and CA have been used to classify sampling sites and identify the underlying pollution sources (Najar et al. 2017; Xue et al. 2017; López-Morales and Mesa-Jurado 2017). The above studies showed that PCA and CA are very important tools for determining the fundamental relationships amongst water quality parameters, identifying the root sources of pollution and grouping similar sampling sites into classes with analogous attributes (Sharma et al. 2015).

To accomplish this goal, CA, PCA and Pearson’s correlation matrix, along with the water quality index, were used to (1) ascertain the similarities and dissimilarities amongst the sampling sites and the associated physiochemical parameters, (2) evaluate the contribution of the affected water quality parameters to temporal and spatial differences in quality at the selected sites, (3) identify the contaminants affecting water quality and their potential sources, and (4) develop a rapid water quality index for assessing the urban stretch of the river.

Materials and methods

Description of the study area

The Gomti River, a surface water body fed by both groundwater and rainfall and a tributary of the Ganges River system, is of paramount importance in the plains of Northern India and was chosen for the present research. This alluvial river emerges from a natural lake (Fulhar-Jheel) within a forest (at an elevation of about 200 m; 28° 13′ 40″ N and 80° 10´ 70″ E) adjacent to Pilibhit in the Uttar Pradesh province of the Indo-Gangetic Plain, around 50 km south of the foothills of the lower Himalayas. This perennial river flows across the central and eastern parts of Uttar Pradesh (U.P.), covering a distance of roughly 960 km before its confluence with the Ganges River close to Kaithighat, Varanasi (Dutta et al. 2011) (Fig. 1). The river, which runs for approximately 33 km through Lucknow city, the capital of Uttar Pradesh, is a chief source of domestic water supply to the city. In return, the Gomti receives domestic wastewater, municipal sewage, agricultural run-off and industrial effluents (from distilleries, sugar mills, agrochemical plants, paper and pulp mills and many others), which are discharged directly and untreated along its course through Lucknow city and the towns of Jagdishpur, Sultanpur and Jaunpur (Dutta et al. 2011; Singh et al. 2004). Besides washing by washermen and cremation at crematoriums, many other activities cause pollution in the Gomti. Except during the monsoon season, the river is marked by sluggish flow throughout the year (Gupta et al. 2014).

Fig. 1 Study area with location of sampling sites of Gomti River in Lucknow City, Uttar Pradesh, India

Sampling and monitoring of pollutants

The present study was carried out from January to April to assess the water quality of the Gomti River and to obtain information on physico-chemical parameters: dissolved oxygen (DO), biochemical oxygen demand (BOD), nitrate (NO3), chemical oxygen demand (COD), pH, total dissolved solids (TDS), chloride (Cl), electrical conductivity (EC), ammonia (NH4+), nitrite (NO2), phosphate (PO43−), sulphate (SO42−), alkalinity (Alk.), hardness (Hard) and heavy metals (Pb, As, Cd, Cr) (Table 1). The sampling sites were chosen according to their vulnerability to contamination: sites close to where the river enters Lucknow were expected to be less polluted by urban contamination, whilst sites in the mid-stretch of the city were expected to be more impaired. Seven sites, namely NH-230 (Ghaila Pul), Pakka Pul, KGMC, Hanuman Setu, Birbal Sahani, Lohia Path and NH-27, were chosen for studying the river quality (Fig. 2). The first site (NH-230), upstream of Lucknow city, lies in a zone of relatively low pollution. The next four sites (Pakka Pul, KGMC, Hanuman Setu and Birbal Sahani) are in the mid-stretch, in the high-pollution zone, because several wastewater drains enter the river there. The last two sites (Lohia Path and NH-27), in the downstream zone, have moderate pollution levels because the river recovers substantially as it flows onward. The sampling approach was designed to capture the main causes of pollution at important sites that characterize the river water quality, by considering contributions from the various pollution sources that influence it (Dutta et al. 2018).

Table 1 Physiochemical parameters of water with their units and methods of measurement

Fig. 2 Sampling sites of Gomti River in Lucknow City

Statistical analysis

The dataset of numerous variables, measured in different months at different sampling sites, was assessed using a multivariate approach comprising Pearson’s correlation, principal component analysis/factor analysis (PCA/FA) and cluster analysis (CA), followed by the development of a water quality index. Statistical and computational analyses were performed using Microsoft Excel 2010 with XLSTAT 2015.1 and IBM SPSS 20 software.

Cluster analysis

Cluster analysis was carried out to investigate similarities in water quality variation between the sampling sites. Cluster analysis is a class of multivariate methods whose primary purpose is to group objects according to the characteristics they possess. It classifies objects so that each object is similar to the others in its cluster with respect to a predetermined selection criterion. The resulting groups of objects then display high internal homogeneity and high external heterogeneity. The most common method is hierarchical agglomerative clustering, which reveals the inherent similarity relationships between any one sample and the whole dataset and is usually illustrated with a tree diagram called a dendrogram.

Cluster analysis, as a multivariate technique, classifies the entire dataset into distinct categories. Depending on how the similarity or distance amongst the observed values is used to form groups, it is divided into hierarchical and non-hierarchical techniques. In this study, the squared Euclidean distance was used, which sums the squared differences of each of the variables to give the distance between the target clusters. Ward’s method was applied as the cluster linkage method (Majkic-Dursun et al. 2018; Nadiri et al. 2018).
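The procedure described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the XLSTAT/SPSS workflow used in the study: the site labels and the EC and BOD readings are hypothetical, the variables are converted to z-scores first (an assumption), and the function names `standardize` and `ward_clusters` are introduced here for illustration. At each step the pair of clusters whose merge gives the smallest Ward cost (the increase in within-cluster sum of squares, based on the squared Euclidean distance between centroids) is combined.

```python
from itertools import combinations
from statistics import mean, pstdev


def standardize(rows):
    """Convert each column to z-scores so that no parameter dominates by scale."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def sq_euclidean(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Column-wise mean of a list of points."""
    return [mean(c) for c in zip(*points)]


def ward_clusters(points, k):
    """Merge clusters agglomeratively until k remain, using the Ward cost."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            ca = centroid([points[i] for i in clusters[a]])
            cb = centroid([points[i] for i in clusters[b]])
            na, nb = len(clusters[a]), len(clusters[b])
            # Ward cost: increase in within-cluster sum of squares after merging
            cost = na * nb / (na + nb) * sq_euclidean(ca, cb)
            if best is None or cost < best[0]:
                best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical (EC, BOD) readings for five sites; not data from this study.
sites = ["S1", "S2", "S3", "S4", "S5"]
readings = [[250, 3.0], [270, 3.5], [260, 3.2], [800, 9.0], [780, 8.5]]
groups = ward_clusters(standardize(readings), k=2)
named_groups = [[sites[i] for i in g] for g in groups]
print(named_groups)
```

Recording the order and cost of the merges, rather than stopping at k clusters, would give the information needed to draw a dendrogram.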

Principal components analysis (PCA)

PCA is a method that recognizes patterns amongst related variables. It transforms a large number of inter-correlated variables into a smaller set of independent (uncorrelated) variables that explain the variance amongst them (Abdi and Williams 2010). It identifies the important parameters that explain the complete dataset, reduces the data and summarizes the relationships between water constituents with minimal loss of the original information. PCA is applied to a correlation matrix of the rearranged data to describe the structure of the underlying dataset, and is used to reveal hidden pollution sources (Zhao et al. 2012).

In this work, principal components analysis was employed to outline the statistical relationships between the water quality parameters. The calculations were done on the correlation matrix of the rearranged chemical constituents, and the PCA scores were obtained from the standardized analytical data.
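As a rough illustration of PCA on a correlation matrix, the sketch below extracts only the first principal component by power iteration, using nothing beyond the Python standard library. It is not the XLSTAT/SPSS procedure used in the study: the readings for EC, TDS and DO are hypothetical, the function names are introduced here for illustration, and a full analysis would extract all components with a numerical library.

```python
from math import sqrt
from statistics import mean, pstdev


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows`."""
    cols = list(zip(*rows))
    z = [[(v - mean(c)) / pstdev(c) for v in c] for c in cols]
    n = len(rows)
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def first_component(matrix, iterations=500):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    vec = [1.0] * len(matrix)
    for _ in range(iterations):
        nxt = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient v' M v gives the eigenvalue for the converged unit vector
    eigenvalue = sum(
        v * sum(m * w for m, w in zip(row, vec)) for row, v in zip(matrix, vec)
    )
    return eigenvalue, vec


# Hypothetical readings (EC, TDS, DO) at five sites; not data from this study.
names = ["EC", "TDS", "DO"]
rows = [[250, 160, 7.5], [400, 260, 6.0], [550, 350, 5.2],
        [700, 455, 4.1], [820, 530, 3.0]]

corr = correlation_matrix(rows)
eigenvalue, vector = first_component(corr)
# Loadings are the eigenvector scaled by the square root of the eigenvalue
loadings = {name: v * sqrt(eigenvalue) for name, v in zip(names, vector)}
# For a correlation matrix the eigenvalues sum to the number of variables
variance_share = 100.0 * eigenvalue / len(names)
print(round(eigenvalue, 3), round(variance_share, 1), loadings)
```

In this toy dataset EC and TDS load positively and DO negatively on the first component, which is the kind of pattern the loadings tables in a PCA report summarize.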

Development of water quality index (WQI)

WQI is a useful means of expressing the level of water quality through a single index (Kansal 2018). The WQI was calculated by the weighted arithmetic index method using the equation:

$$\mathrm{WQI} = \sum_{i=1}^{n} Q_{i} W_{i} \tag{1}$$

where Qi is the sub-index for the ith water quality parameter, Wi is the weight of the ith water quality parameter and n is the number of water quality parameters. Water quality was rated from the index as excellent (100–95), good (94–80), fair (79–65), marginal (64–45) or poor (44–0), following Singh and Saxena (2018).
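A minimal sketch of Eq. (1) and the rating scale is given below. The sub-index and weight values are hypothetical, the weights are assumed to sum to 1 (the text does not state the normalization), and non-integer scores are assumed to fall into the class whose lower bound they reach, since the quoted classes leave gaps between, for example, 94 and 95.

```python
def weighted_wqi(sub_indices, weights):
    """Weighted arithmetic index: sum of Q_i * W_i over all parameters."""
    if set(sub_indices) != set(weights):
        raise ValueError("sub-indices and weights must cover the same parameters")
    return sum(sub_indices[p] * weights[p] for p in sub_indices)


def rate_wqi(wqi):
    """Map a WQI score to the rating classes quoted in the text."""
    if wqi >= 95:
        return "excellent"
    if wqi >= 80:
        return "good"
    if wqi >= 65:
        return "fair"
    if wqi >= 45:
        return "marginal"
    return "poor"


# Hypothetical sub-indices (0-100) and weights; not values from this study.
q = {"DO": 60, "BOD": 50, "pH": 90}
w = {"DO": 0.4, "BOD": 0.4, "pH": 0.2}  # assumed to sum to 1

score = weighted_wqi(q, w)
print(score, rate_wqi(score))
```

With this scale, the index of 58.98 reported for Birbal Sahani falls in the marginal class, as stated in the results.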

Results and discussions

Spatio-temporal variations of Gomti River

Box and whisker plots are used to show the spatio-temporal variation of the water quality parameters (Fig. 3). In the boxplots, long whiskers and outliers indicate wide variation in a parameter’s concentration. Because extreme low or high values strongly influence the mean concentration of a parameter (Dabrowski et al. 2018), the median is used in the boxplots to indicate the central tendency of the dataset. Water samples from seven sites on the river were studied over four sampling months, i.e. January, February, March and April. The EC of the river varied from 240 to 822 S/m. In April, the EC was very low (256 S/m) at NH 230, while a very high value (822 S/m) was found at the Birbal Sahani site. The pH was moderate at all sites except KGMC, where it was very low (6.22) in January. NH 230 showed a very high pH in February (8.6) and April (8.57), which lay outside the pH range of the other sites in those months. TDS was highest in April, although the TDS at NH 230 was very low compared with the other sites in all months. The NO3, NO2 and NH4+ concentrations were higher in April.
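The point about the median being less sensitive to extreme values than the mean can be illustrated with a short sketch. The readings are hypothetical, and the 1.5 × IQR fence used to flag outliers is the common boxplot convention, assumed here because the text does not say which rule the plotting software applied.

```python
from statistics import mean, median, quantiles


def box_summary(values):
    """Quartiles, interquartile range and 1.5*IQR outliers for a boxplot."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "median": q2, "q3": q3, "iqr": iqr, "outliers": outliers}


# Hypothetical EC readings with one extreme value; not data from this study.
ec = [300, 310, 320, 330, 340, 350, 820]

summary = box_summary(ec)
print("mean:", round(mean(ec), 2), "median:", median(ec))
print(summary)
```

Here the single extreme reading pulls the mean well above the bulk of the data while the median stays at the middle value, which is why the median is the natural centre line for a boxplot.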

Fig. 3 Box and whisker plots of the physiochemical parameters

DO was lower in April than in the other months at all sites, while BOD increased in April, which indicates that as temperature increases, BOD increases and the dissolved oxygen concentration decreases. BOD thus varies inversely with DO as temperature changes (Diamantini et al. 2018). The Cl concentration increased in April but remained low (11.66 mg/L) at NH 230, as shown in Table 2. The heavy metals Pb, As, Cd and Cr showed comparatively high concentrations in April. Because of a marked increase in temperature, most of the water quality parameters showed a substantial rise in April. This could explain the wide interquartile ranges and the outliers in the boxplots.

Table 2 Variation in physiochemical parameters of water at various sampling sites (Dutta et al. 2018)

Correlation

The correlation coefficient is a statistical tool used to describe the extent to which one variable depends on another and to quantify the association between two variables (Bajpayee et al. 2012). Correlation loadings measure the strength of association between the variables and the principal components (Furlan et al. 2019). The major loadings can be either negative or positive, and the sign indicates the direction of the contribution. A negative loading indicates that the contribution of the variable decreases, whereas a positive loading indicates that it increases (Tashtoush and Al-Subh 2015). In this study, the correlation coefficient was applied to show the extent to which each variable depends on the others. Before statistical analysis, the data were standardized using the criterion given by Little and Rubin (2019). EC and TDS are strongly correlated with each other, which indicates pollution from human activities such as run-off from agricultural fields or brick kilns into the river. NO3 was moderately correlated with BOD, COD and DO, and NH4+ was correlated with PO43− and SO42−, which indicates degradation of organic matter (Table 3). Such correlations amongst the parameters are indicative of organic matter decomposition in the river, along with industrial discharge.
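For reference, Pearson’s coefficient between two parameters can be computed as in the sketch below. The EC and TDS readings are hypothetical and chosen only to show a strong positive association of the kind reported above; the function name `pearson_r` is introduced here for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical paired readings; not data from this study.
ec = [250, 400, 550, 700, 820]
tds = [160, 260, 350, 455, 530]

r_ec_tds = pearson_r(ec, tds)
print(round(r_ec_tds, 3))
```

Applying the same function to every pair of parameters fills a correlation matrix of the kind shown in Table 3.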

Table 3 Correlation of water quality parameters (Pearson coefficient)

Cluster analysis

Cluster analysis was applied to the dataset to identify clusters of spatially similar sampling sites. CA grouped the sampling sites into two separate clusters, Cluster 1 and Cluster 2 (Fig. 4).

Fig. 4 Dendrogram based on hierarchical clustering (Ward’s method) for seven sampling sites of Gomti River in Lucknow City

Cluster 1 was formed by the sites Lohia Path, NH 27, Pakka Pul and NH 230, while Cluster 2 comprised KGMC, Hanuman Setu and Birbal Sahani. Cluster 1 corresponds to low pollution and comprises sites from the extreme upstream and downstream reaches of the Gomti River in Lucknow, whereas Cluster 2 corresponds to high pollution and comprises sites in the mid-stretch. A closer analysis of Cluster 1 revealed that Lohia Path, NH 27 and Pakka Pul resembled one another more closely than they resembled NH 230. In Cluster 2, KGMC, Hanuman Setu and Birbal Sahani were grouped as having similar characteristics. Water samples in Cluster 1 had far better physical and chemical quality than those in Cluster 2. Each cluster grouped sampling sites that have similar natural characteristics and are contaminated by similar pollution sources. The differences in river water quality between the two clusters revealed differences in the types and levels of anthropogenic pollution loads. The improved water quality at NH 27 and Lohia Path (downstream) indicated the river’s capacity for self-purification and recovery (Sim and Tai 2018). The deterioration of water quality in the mid-stretch at KGMC, Hanuman Setu and Birbal Sahani revealed severe pollution from domestic wastewater, agricultural run-off, municipal solid waste and effluents discharged by industries situated near the river bank. The results indicated that two distinct types of water quality exist along the studied stretch of the river. This suggests that one sampling site from each cluster may be sufficient for a spatial evaluation of water quality in the area, which could be used for rapid assessment of water quality (Martinez-Tavera et al. 2017). CA is thus a useful tool for reducing the number of sampling sites needed.

Principal component analysis of quality of Gomti River

PCA/FA based on the correlation matrix was carried out to discern the inherent interrelationships amongst the water quality variables at all the studied sites and thereby to examine their properties (Barakat et al. 2016). The factor loadings were categorized as ‘strong’, ‘moderate’ and ‘weak’ for absolute loading values of 1–0.75, 0.75–0.50 and 0.50–0.30, respectively (Azid et al. 2015). High positive or negative loadings show that the variables are significant for a given source of pollution (Burstyn 2004). The factor loadings of all principal components (PCs)/factors, the eigenvalues (Ev), the variability and the cumulative variance are given in Table 4. A high eigenvalue indicates that the corresponding loadings are significant. The sum of all eigenvalues equals the total variance (Abdi and Williams 2010). Scree plots are widely used to determine the number of factors present in the underlying data structure (Liu et al. 2003).
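The loading categories quoted above can be expressed as a small helper. This is a sketch under two assumptions that the text does not settle: a value exactly on a boundary (0.75 or 0.50) is assigned to the higher class, and loadings below 0.30 in magnitude are labelled "negligible", a term introduced here. The example loadings are hypothetical.

```python
def classify_loading(loading):
    """Label a factor loading by magnitude (strong/moderate/weak) and sign."""
    size = abs(loading)
    if size >= 0.75:
        strength = "strong"
    elif size >= 0.50:
        strength = "moderate"
    elif size >= 0.30:
        strength = "weak"
    else:
        return "negligible"  # assumed label for loadings below 0.30
    direction = "positive" if loading > 0 else "negative"
    return f"{strength} {direction}"


# Hypothetical loadings on one component; not values from Table 4.
example_loadings = {"EC": 0.93, "NO3": -0.62, "pH": -0.35, "As": 0.12}
labels = {name: classify_loading(v) for name, v in example_loadings.items()}
print(labels)
```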

Table 4 Loadings of 18 variables on principal components for the whole datasets

Figure 5 shows the Ev sorted from largest to smallest as a function of the factor number. The scree plot is used to determine the number of principal factors in the data, and it shows that the first four factors account for 88.97% of the total variance.
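The selection of factors from sorted eigenvalues can be sketched as follows, assuming the common rule of retaining components with an eigenvalue greater than 1 and reporting each component’s share of the total variance. The eigenvalues are hypothetical (six standardized variables, so they sum to 6) and are not those of Table 4; the function name is introduced here for illustration.

```python
def retained_components(eigenvalues, threshold=1.0):
    """List (number, eigenvalue, share %, cumulative %) for eigenvalues above threshold."""
    total = sum(eigenvalues)
    ordered = sorted(eigenvalues, reverse=True)
    rows, cumulative = [], 0.0
    for number, ev in enumerate(ordered, start=1):
        if ev <= threshold:
            break  # remaining components explain less than one original variable
        share = 100.0 * ev / total
        cumulative += share
        rows.append((number, ev, round(share, 2), round(cumulative, 2)))
    return rows


# Hypothetical eigenvalues for six standardized variables; not values from Table 4.
example_ev = [2.7, 1.5, 1.1, 0.4, 0.2, 0.1]
kept = retained_components(example_ev)
for row in kept:
    print(row)
```

Plotting the sorted eigenvalues against the component number gives the scree plot, and the cumulative column corresponds to the cumulative variance reported alongside the loadings.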

Fig. 5 Scree plot of the eigenvalues (Ev)

In the present study, the Ev were extracted for the PCs. Four PCs with Ev > 1 were obtained, which explained approximately 89% of the total variance of the dataset (Table 4). The components with Ev < 1 were discarded because of their low significance. The first PC, accounting for 44.91% of the total variance, showed high positive loadings for EC, TDS, NH4+, PO43−, SO42−, Hard and COD; moderate positive loadings for NO2, Cl, Cd and Cr; moderate negative loadings for NO3 and DO; and weak negative loadings for pH and Pb. The presence of such contaminants indicates organic as well as inorganic pollution from anthropogenic sources, such as domestic wastewater, untreated municipal sewage, industrial effluents and discharges from water treatment plants. The presence of Cd and Cr indicates industrial effluents from the production of corrosion inhibitors and pigments, which could be toxic to aquatic life (Jaishankar et al. 2014). The presence of PO43− could be due to detergents in municipal sewage and industrial effluents. The presence of NO2 and SO42− is attributed to fertilisers and livestock waste in run-off from agricultural fields. The second PC, responsible for 22.97% of the total variance, showed high positive loadings for Alk and BOD; moderate positive loadings for pH, NO3, NO2, DO and Cr; weak positive loadings for NH4+, PO43− and As; and a moderate negative loading for Cd. These loadings, which include organic pollution, may be related to inputs from point sources (municipal and industrial), non-point sources (run-off from agricultural fields) and livestock activities coupled with domestic influences (Khatri and Tyagi 2015). The high BOD reflects the cumulative effect of anthropogenic pollution from different sources, such as untreated municipal discharges entering the river directly via drains. The third PC contributed 11% of the total variance; Pb and As showed moderate positive loadings, and EC, TDS, Hard and COD showed weak positive loadings, which indicates the presence of inorganic pollutants, while the weak negative loading for Cl is indicative of a large fraction of non-biodegradable waste from anthropogenic activity.

The fourth PC, responsible for 10.10% of the total variance, showed a moderate positive loading for Pb and a moderate negative loading for Cd, weak positive loadings for pH and NH4+, and weak negative loadings for BOD and As, indicating industrial pollution. The fifth PC, accounting for 5.96% of the variance, showed weak positive loadings for NO3, BOD and COD, indicative of the decomposition of organic as well as inorganic matter in the river. In the biplot, the F1 axis contributes 44.91% and the F2 axis 22.97% of the total variance, and the points represent the physiochemical parameters and the observations, i.e. the sites. Birbal Sahani (mid-stretch), in the top right corner along F1, was associated with NH4+, SO42−, PO43−, NO2, hardness, COD, Cl, EC, TDS, Cr and Cd. Pakka Pul, in the top left corner along the second axis, was associated with pH, Alk, BOD, DO, NO3 and As (Fig. 6). Hanuman Setu, NH 27 and KGMC formed a group with a similar pollution level, while NH 230 showed lower pollution than the other sites.

Fig. 6 Biplot between physiochemical parameters and sampling sites

Water quality index of Gomti River

The current water quality index indicates that water quality in the Gomti River is threatened by the recurrent release of waste effluents from numerous anthropogenic discharges that enter the river directly (Dutta et al. 2018; Sharma and Kansal 2011). The Gomti has a fair water quality index where it enters Lucknow city near Ghaila, but as it advances through the city at sites such as Pakka Pul, KGMC, Hanuman Setu and Birbal Sahani, the index worsens to marginal (Table 5). At Birbal Sahani, the water quality index was 58.98 (Table 5), which suggests that this site, located in mid-Lucknow, is heavily contaminated owing to the numerous drainage inputs (domestic as well as municipal) from the main city. Once the river moves away from Birbal Sahani and flows downstream past sites such as Lohia Path and NH 27, the index improves again to fair, by virtue of the river’s self-purifying capacity as it moves away from the heterogeneous pollution sources.

Table 5 WQI of Gomti river stretch in Lucknow city (Dutta et al. 2018)

Discussion

Statistical techniques are widely used for the temporal and spatial assessment of the physico-chemical and bacteriological parameters of water quality in any water body, but the underlying principles governing important ongoing processes within riverine or aquatic ecosystems have not been investigated thoroughly. A number of published studies have emphasized the spatio-temporal variation of water quality parameters (physical, chemical, biological) and the use of multivariate statistical approaches such as PCA, FA, CA, DA and many more. In a study of eight locations on the Gomti River over 3 years covering 34 parameters, multivariate statistical techniques were found to be valuable for analysing complex water quality data effectively and generating results for better monitoring and management of lotic water bodies like the Gomti (Singh et al. 2005a, b). A two-year study of 16 polycyclic aromatic hydrocarbons (PAHs) in the water and sediments of the Gomti River suggested that pyrolytic and petrogenic sources contribute to the PAHs in the river. Two- and three-ring PAHs were dominant in the water, and three- and four-ring PAHs in the sediments (Malik et al. 2011). A 5-year monitoring study of heavy metal (Cd, Cr, Cu, Fe, Mn, Pb, Zn and Ni) contamination revealed that the water and bed sediments of the Gomti River are severely affected by human activities. Moreover, calculation of the geo-accumulation index of the metals at individual sites showed that the river bed is severely affected by heavy metals, which may cause toxicity in the river and affect its aquatic diversity (Singh et al. 2005a, b).

Various water quality indices have also been developed for monitoring purposes using selected parameters, including heavy metals. In the present study, heavy metals (Pb, As, Cd, Cr) were also analysed and a water quality index was developed that includes them. Although the water quality index gives a clear picture of the condition of a water body, it does not inform us about the background processes in an aquatic ecosystem. The outcome of any analysis depends greatly on factors such as data quality, data handling and an understanding of the data prior to statistical analysis and interpretation. Most research deals with the monitoring and management of water quality but does not address the underlying factors that drive these problems. Research should first apply the researcher's own knowledge and judgement and only then proceed to more complex statistical methods for interpreting the outcomes.

Keeping these questions in mind, one such study was carried out on the Tay Ninh River in the Dong Nai River basin, Vietnam, where a water body loaded with pollutants (nutrients and organic matter) was studied for nitrification inhibition by means of statistical methods (PCA, CA, DA) (Le et al. 2017). At nearly all stations, the mean ammonium concentration was about 1.5 mg N/L, whilst the mean nitrate concentration was around 0.4 mg N/L. According to Wetzel (2001), when nitrification operates properly in an unpolluted water body, the NO3–N to NH4–N ratio is 25:1, whereas in the Tay Ninh River the ratio fluctuated from 1:2 to 1:12.8. The primary productivity of certain ecosystems is regulated by nitrogen because it is a vital nutrient for the primary production of plants and algae. Aquatic macrophytes need nitrogen and phosphorus in a proportion of 30:1. Relative to phosphorus, the nitrogen requirement of higher plants is twice that of phytoplankton. Various studies have shown that nitrogen unavailability hampers primary production in higher plants (Durand et al. 2011). Both an excess and a deficit of N are harmful to a healthy, balanced and sustainable aquatic ecosystem. An excess causes serious problems such as eutrophication, hypoxia, reduced species diversity, and disruption of habitats and ecosystems (Durand et al. 2011), while a deficit may result in low primary productivity of aquatic flora such as macrophytes and cyanobacteria.
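The contrast between the Tay Ninh concentrations and the 25:1 reference ratio can be checked with a short calculation. The sketch below is illustrative only: it uses the approximate mean concentrations quoted above (0.4 mg N/L nitrate, 1.5 mg N/L ammonium), and the function names are hypothetical.

```python
REFERENCE_NO3_TO_NH4 = 25.0  # NO3-N : NH4-N in unpolluted water (Wetzel 2001, as cited above)


def nitrate_to_ammonium_ratio(no3_n_mg_l: float, nh4_n_mg_l: float) -> float:
    """Return the NO3-N to NH4-N ratio for concentrations given in mg N/L."""
    if nh4_n_mg_l <= 0:
        raise ValueError("ammonium concentration must be positive")
    return no3_n_mg_l / nh4_n_mg_l


def format_ratio(ratio: float) -> str:
    """Write the ratio as 'x:1' when nitrate dominates, otherwise as '1:x'."""
    if ratio >= 1:
        return f"{ratio:.2f}:1"
    return f"1:{1 / ratio:.2f}"


# Approximate mean values reported for the Tay Ninh River stations
ratio = nitrate_to_ammonium_ratio(no3_n_mg_l=0.4, nh4_n_mg_l=1.5)
print("NO3-N : NH4-N =", format_ratio(ratio))
print("Below reference ratio:", ratio < REFERENCE_NO3_TO_NH4)
```

With these mean values the ratio is about 1:3.75, which lies within the reported range of 1:2 to 1:12.8 and far below the 25:1 reference.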

Oxygen deficiency occurs when a sufficiently large quantity of ammonium from untreated and partially treated wastewater is present in a lotic water body. In European rivers, a nitrification rate of nearly 0.1 mg N/L (nitrogen per litre) per day has been documented. In these rivers, the bacterial population varies from 0.005–0.05 mg C/L (carbon per litre) per day in unaltered natural forest rivers and 0.01–0.5 mg C/L per day in large rivers to as high as 2.5 mg C/L per day in rivers heavily polluted with municipal sewage (Brion and Billen 2000). Inhibition occurs when the activity of the ammonium oxidizers and the nitrite oxidizers (chemo-lithotrophic bacteria) is negatively influenced by factors such as dissolved oxygen, pH, water temperature, substrate concentration and salinity (Chen et al. 2006). These obligate autotrophic bacteria, which carry out the biological oxidation of ammonium to nitrate, obtain energy from nitrification for fixing CO2.

BOD signifies the amount of oxygen consumed by two major processes. Carbonaceous biochemical oxygen demand (CBOD) is the oxygen consumed when micro-organisms metabolize organic carbon matter to CO2. Nitrogenous biochemical oxygen demand (NBOD) is the oxygen consumed in the biological oxidation of ammonia to nitrate during nitrification. The total amount of oxygen required by carbonaceous and nitrogenous matter over an extended period of time is the ultimate BOD. Considering this, CBOD and NBOD should be analysed separately. A further aspect, which combines several processes, is sediment oxygen demand (SOD): the amount of oxygen consumed in the decay of organic matter settled in the river. In a free-flowing river, the SOD is nearly insignificant because long-term accumulation of organic matter is prevented by recurrent scouring during floods and storm events. As the Gomti is no longer the free-flowing river it was before channelization, SOD should also be considered in the analysis. NBOD can increase significantly in a water body as organic nitrogen and NH3–N are converted to nitrate during nitrification, consuming oxygen. The CBOD is even smaller than the NBOD, although the NBOD does not increase rapidly because it depends on environmental factors such as alkalinity, the nitrifying bacterial population, temperature and dissolved oxygen availability. Alongside CBOD, NBOD exerts a comparatively large demand on oxygen: the oxidation of 1 mol of NH4+ consumes 1.89 mol of oxygen and produces 1.98 mol of hydrogen ions. Therefore, NBOD and CBOD should be studied separately, and SOD should also be considered.
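The stoichiometry quoted above (1.89 mol of oxygen consumed and 1.98 mol of hydrogen ions produced per mol of NH4+) can be turned into a rough estimate of the oxygen demand exerted by a given ammonium load. The sketch below is a minimal illustration, assuming complete nitrification and standard molar masses; the function names are hypothetical, and the 1.5 mg N/L input is simply the Tay Ninh mean mentioned earlier, used here as an example value.

```python
N_MOLAR_MASS = 14.007    # g/mol, nitrogen
O2_MOLAR_MASS = 31.998   # g/mol, molecular oxygen
O2_PER_NH4 = 1.89        # mol O2 consumed per mol NH4+ oxidized (value quoted in the text)
H_PER_NH4 = 1.98         # mol H+ produced per mol NH4+ oxidized (value quoted in the text)


def nbod_mg_o2_per_l(nh4_n_mg_l: float) -> float:
    """Stoichiometric oxygen demand (mg O2/L) for full nitrification of NH4-N (mg N/L)."""
    mmol_n_per_l = nh4_n_mg_l / N_MOLAR_MASS
    return mmol_n_per_l * O2_PER_NH4 * O2_MOLAR_MASS


def acidity_mmol_h_per_l(nh4_n_mg_l: float) -> float:
    """Hydrogen ions released (mmol H+/L) for full nitrification of NH4-N (mg N/L)."""
    return nh4_n_mg_l / N_MOLAR_MASS * H_PER_NH4


# Example input: 1.5 mg N/L of ammonium (illustrative value)
print(f"NBOD: {nbod_mg_o2_per_l(1.5):.2f} mg O2/L")
print(f"Acidity produced: {acidity_mmol_h_per_l(1.5):.3f} mmol H+/L")
```

On this basis each mg of ammonium nitrogen demands roughly 4.3 mg of oxygen, which illustrates why NBOD can place a considerable pressure on dissolved oxygen alongside CBOD. Actual NBOD in a river will be lower wherever nitrification is inhibited or incomplete.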

Researchers have typically studied aquatic ecosystems over extended time spans, whether the objective is developing a water quality index (Tripathi and Singal 2019; Tian et al. 2019; Ewaid et al. 2018), developing a transitional water quality index (Giordani et al. 2009) or the long-term physico-chemical and biological characterization of a riverine ecosystem (Jameel et al. 2018; Gupta et al. 2017; Le et al. 2017). Pollution, however, is an anthropogenic nuisance that does not occur for just a season or two; it is added to a river day by day through poor sanitation practices along the banks, industrial and agricultural discharge and direct dumping. Long-term studies alone are therefore not enough for monitoring purposes. Because a river is a lotic ecosystem, the plume of pollutant load is constantly moving from one place to another. The authors therefore suggest short-term studies of surface water bodies.

Considering all the ongoing processes within an aquatic ecosystem leads us to the view that analysis, monitoring and index development alone do not reveal the problematic status of a surface water body. As is evident from the study of Le et al. (2017), the formation of nitrate from ammonium via nitrification is sometimes inhibited by unfavourable pH and DO conditions and certain other factors. These behind-the-scenes problems have not been integrated into any water quality index developed so far. Although rapid WQI development is required for monitoring, aspects such as nitrification inhibition, carbonaceous and nitrogenous biochemical oxygen demand, and sediment oxygen demand should be considered at the same time. Only when these vital processes are considered and integrated into the evaluation of water quality status will monitoring studies give a comprehensive understanding of the current status of an aquatic ecosystem.

Conclusion

In the current study, multivariate techniques such as PCA/FA and CA were found to be efficient and realistic for assessing spatio-temporal differences in river water quality, identifying contamination sources, categorizing sites and identifying the significant parameters of Gomti water quality in Lucknow city. CA grouped the sampling sites into two clusters (Cluster 1 (Moderately Polluted): Lohia Path, NH 27, Pakka Pul and NH 230; Cluster 2 (Heavily Polluted): KGMC, Hanuman Setu and Birbal Sahani), illustrating that the water quality of the Gomti river gradually worsens from the upstream and downstream stretches towards the middle stretch. The middle stretch was found to be the most polluted, as the major drains are concentrated within it. PCA/FA helped in identifying the factors/sources responsible for river water quality differences in the study area. Boxplots proved to be a potent tool for analysing the spatio-temporal variation of the river. The sources responsible for contamination are semi-treated/untreated urban wastewater entering through many drains, slum areas adjoining the river bank that release their untreated wastewater and garbage directly into the river, direct dumping of religious and household wastes by local residents, Dhobhighats (sites where washermen wash clothes) and crematorium on the river banks, along with agricultural runoff. The underlying reason for the deterioration in water quality is the lack of awareness among the locals and cultural practices such as the immersion of idols and religious materials during and after various festivals. River channelization has also deteriorated the water quality to a great extent, as the water flow is controlled by barrages and weirs in this stretch, so the river is unable to replenish itself.
To monitor rivers like the Gomti, short-term monitoring and rapid water quality index development are the need of the hour, but before drawing any conclusion, natural ongoing processes such as nitrification inhibition, nitrogenous and carbonaceous biochemical oxygen demand and sediment oxygen demand should be considered in water quality index development. Our study emphasized short-term, continuous monthly measurement of water quality and index development for the river so that management decisions can focus on gradual improvements in water quality.