1 Introduction

Drought can be defined as the prolonged period of dry conditions [1, 2], and it is one of the most complicated and least understood natural hazards. It is one of the natural disasters that human being has suffered since the ancient era [3,4,, 4], and it is the costliest, recurrent natural disaster [5]. A meteorological drought can be expressed as a result of a precipitation shortage or as a lack of precipitation over a region for a period of time [4]. According to the Olukayodo [6], meteorological drought is one of the most essential and primary drought types that need to be addressed with care and attention. Meteorological drought simulation is a critical element in drought risk management [7]. Simulation models express the real behavior of a particular phenomenon needs to address appropriately to know the actual nature of a specific system [8]. Simulation of meteorological drought is more important than prediction to understand drought-prone scenarios effectively [9]. Mishra et al. [10] asserted that the accurate assessment and simulation of drought are essential for proper planning and management of water resources. Drought preparedness and mitigation depend upon the large-scale drought monitoring and simulation over a particular geographical area [11]. Many drought simulation models are already developed in the field of civil engineering. Mishra and Desai [12] developed ARIMA and multiplicative seasonal ARIMA models to forecast drought using SPI series. Zhang et al. [13] simulated SPI using the ARMA–GARCH model in China. Habibi et al. [14] simulated SPI12 by applying the Markov chain model. Morid et al. [15] simulated drought using artificial neural network (ANN) using effective drought index (EDI) and standard precipitation index (SPI). Mishra and Desai [16] simulated drought up to 6 months lead time by using and comparing linear stochastic models with the recursive multistep neural network. Barros and Bowden [17] applied self-organizing maps (SOM) and multivariate linear regression analysis to simulate SPI (in 12 months lead time) of Murray–Darling basin of Australia.

Mishra and Desai [12] argued that exponential smoothing techniques are the systematic empirical methods for simulation and forecasting of meteorological drought with a relatively higher accuracy rate. Holt and Winter originally developed exponential smoothing in the year of 1960, but later it was significantly modified by Kalehar [18]. Raha and Gayen [19] innovatively used exponential smoothing models and made realistic predictions of the drought of Bankura District, West Bengal [19]. They also argued that exponential smoothing might be a leading model in simulation and prediction of drought phenomena though seriously neglected in the framework of drought simulation. Exponential smoothing models have seasonal adjustability, and they can perform more realistic simulations than other statistical models [12, 20]. Forecasts produced by exponential smoothing methods are the weighted average of the past observations (Hyndman et al. [21]). In other words, the more recent observation gets higher associated weight. This framework generates reliable forecasts quickly for the wider range of time series, which is a great advantage (Billah et al. [22]). Raha and Gayen [19] stressed over the traditional exponential and Holt–Winter exponential smoothing, and also they have only considered 12 months time step in their research. They have failed to judge the seasonal flexibility of exponential smoothing. The present study simulates drought in different time steps using exponential smoothing and also comparatively assesses different exponential smoothing procedures in drought simulation and modeling. Station-wise assessment of intensity and occurrence rate of meteorological drought is the most necessary parameters of realistic simulation [23,24,25,26,27]. To visualize the drought phenomena effectively, spatial interpolation method IDW is applied. This study attempts to simulate drought (in 1 month lead time) using two exponential smoothing models in the time period of 1979–2014. Thus, this study has following objectives:

  • Simulation of meteorological drought at 1 month lead time using Holt–Winter and double exponential smoothing models

  • Comparative assessment between double exponential and Holt–Winter exponential method in the simulation of meteorological drought.

There is increasing evidence that climate change will affect West Bengal, and especially drought will be one of the most challenging issues for Bankura [2830]. Mishra and Desai [16] forecasted and analyzed drought in Kangsabati River Basin using an artificial recursive neural network. Markov chain model was applied to estimate dryness of Purulia District by Banik et al. [30]. Lohar and Pal [31] showed that the mean monthly pre-monsoonal rainfall has decreased, and the temperature has increased significantly in the last decades of the twentieth century. According to Ghosh, [32] Bankura and its associated regions are expected to receive less rain in monsoonal season [33]. Bankura is likely to experience a 1ºC rise in average temperature during 2025–2099 [3233]. Over the past few years, the impact of climate change has felt severely in Bankura [34,35,, 35]. Delay in arrival of monsoon season is observed in Bankura and its associated tract [32]. It is also noticed that summer becomes long, and drought has become more frequent [2, 36]. The problems are further being compounded with the growing population, lack of water resources and adaptation with water-intensive commercial crops. The Bankura, including Gangetic West Bengal (GWB), is less experienced in coping with drought. In such a pandemonium, spatiotemporal simulation of drought of Bankura is a good attempt.

Figure 1 denotes the location map of the study region. Bankura is an administrative division surrounded by Purba Bardhaman and Paschim Bardhaman District in the north, Purulia District in the west, Jhargram and Paschim Medinipur in the south. Although the average rainfall of Bankura District is 1400 mm, but much rainfall happens for the months of June to September [27]. Hot westerly winds prevail in Bankura from March to June in Bankura in the study area [36]. The study region becomes geographer’s attraction for last five to seven years due to excessive drought proneness-, poverty- and migration-related scenarios [27].

Fig. 1
figure 1

Location map of the study area

2 Datasets and methodology

Table 1 determines the list of meteorological stations with latitude, longitude and elevation (m). Rainfall data from 1979–2014 on daily bases downloaded from the official website of Climate Forecast System Reanalysis (CFSR) in SWAT format. According to Dile and Srinivason [37], NCEP dataset is one of the most widely used trusted datasets that can easily be used in drought simulation and monitoring purposes. Here, daily rainfall data have been converted to monthly rainfall values. Figure 2 denotes the location of meteorological stations marked within the Bankura District, West Bengal. Figure 3 signifies the overall methodological framework of this research. From rainfall data, SPI has been estimated in 3, 6, 12, 24 and 48 months time frame. Peak drought intensity (lowest SPI value), extreme to severe and moderate drought occurrence rate are assessed using observed and simulated models at above mentioned time steps. Hazard zones are also identified based on PCA score generated using three models.

Table 1 List of meteorological stations with latitude, longitude elevation, mean and standardized rainfall (1979–2014)
Fig. 2
figure 2

Locations of meteorological stations of Bankura District

Fig. 3
figure 3

Methodological framework

2.1 Determination of drought

SPI was developed by Mckee et. al. [38] to monitor drought. The responsiveness of precipitation deficits of SPI was found more reliable within shorter and longer time steps [39]. Because of simplicity of calculation, ability to address different drought-related issues at a glance, SPI is found most suitable and reliable can be applicable in different parts of the world [40,41,42]. Calculation of SPI should be based on different time steps to denote drought effectively [43,44,, 44]. Here, classification of SPI is based on 3 months, 6 months, 12 months, 24 months and 48 months time step. Three months and 6 months are considered as short term and 12, 24, and 48 months time frame is considered as long-term time frame. The analysis of SPI is performed in Meteorological Drought Monitoring (MDM) software prepared by Agrimetsoft team [44]. This software is a critical tool for calculating and comparing drought in multiple locations at different time steps. Salehnia et. al. [45] and agrimetsoft.com express detailed procedure of calculation of SPI through MDM software as follows:

After arranging the station-wise rainfall data (x) in column format, that data are incorporated into the MDM software. Using the input data, deviation of total rainfall (x) from long-term rainfall mean \((\overline{x})\) is estimated by the software. After completion of the process, that total deviation is divided by the standard deviation of rainfall (\({\updelta })\) which is basically as follows [45,46,, 46]:

$${\text{SPI}} = \frac{{X_{i} - \overline{X}}}{\delta }.$$
(1)

The long-term rainfall is then fitted to the probability distribution and then transformed into the normal distribution to the mean SPI for the location and the desired period is zero [36],

$${\text{SPI}} = \frac{a - M}{\delta }$$
(2)

where a is the individual gamma distribution, M is mean, and δ is the standard deviation of rainfall.

Based on Mckee et. al. [38], drought severity classes are identified in Table 2.

Table 2 Drought severity classes based on SPI

2.2 Exponential smoothing models

Exponential smoothing is a technique to smooth time series data using the exponential window function. Exponential smoothing technique assigns decreasing weights over time. This framework generates reliable forecast quickly and has a great advantage in real-time simulation [44]. The two basic exponential smoothing models are taken in this study which is as follows:

2.2.1 Holt–Winter exponential smoothing

Holt–Winter exponential smoothing is a statistical technique which is used here to simulate SPI at 3, 6, 12, 24 and 48 months time steps. It provides effective way of simulation of time series data. This method is efficient to capture seasonality [47]. There are three weights of smoothing constants, viz. α, β, Y representing the level, trend and season, respectively, used to upgrade component for each period of time, t. The value of these constants lies within 0 and 1. This value is selected depending on the weight (high smooth constant mean ensures more weight). The initial value taken in this research is 0.2. Holt–Winter model is applied in this research using following equations [47]:

Level:

$$L_{t} = \alpha \left( {Y_{t} - S_{t - m} } \right) + \left( {1 - \alpha } \right)\left( {L_{t - 1} + b_{t - 1} } \right)$$
(3)

Trend:

$$b_{t} = \beta \left( {L_{t} - L_{t - 1} } \right) + \left( {1 - \beta } \right)b_{t - 1}$$
(4)

Seasonal:

$$S_{t} = \Upsilon \left( {Y_{t} - L_{t} } \right) + \left( {1 - \Upsilon } \right)S_{t - m}$$
(5)

Fitted:

$$F_{t + h/t} = L_{t} + hb_{t} + S_{{t + h - m\left( {k + 1} \right)}}$$
(6)

where Lt is the level, bt is the trend, St is the season, Yt is the SPI, and t is the time period of Lt, bt, St and Yt components. Ft is the forecast value ahead of one period, α, β, Υ level, trend and seasonal smoothing coefficient estimators, respectively, St is the seasonal duration. K is the integer part of (h − 1)/m, which ensures the estimates of seasonal indices, used for forecasting, come from the final year Y, i.e., here SPI. The level equation shows a weighted average between seasonally adjusted observation \(\left( {Y_{t} - S_{t - m} } \right)\) and the non-seasonal forecast \(\left( {L_{t - 1} + b_{t - 1} } \right)\) for time t. The initial value of seasonal component s1 is determined using Eq. (7),

$$S_{1} = Y_{1} - L_{m} ;\;\;S_{2} = Y_{2} - L_{m} \ldots S_{m} = Y_{m} - L_{m}$$
(7)
$$L_{s} = \frac{1}{m}\left( {Y_{1} + Y_{2} + Y_{3} + \cdots Y_{m} } \right).$$
(8)

The Holt–Winter model is also known as Winter’s triple exponential smoothing [47] as this method was originally developed by Winter. In this method, the seasonal component is expressed in absolute terms with the scale of the observed series and in the level equation is seasonally adjusted by subtracting the seasonal component. Within each year, the seasonal component will add up approximately zero. Here, Yt − Lt in Eq. 5 is the weighted average of seasonally adjusted observation of SPI. Non-seasonal simulated term is \(L_{{{\text{t}} - 1}} + b_{t - 1}\) for time t. This method has great adjustability in interpreting seasonal and non-seasonal behavior [44].

2.2.2 Double exponential smoothing

Double exponential smoothing is applicable when the data show a trend. In this method, seasonal component and trend component are updated in each period [47, 48]:

Simulated:

$$F_{t + h/t} = S_{t} + hb_{t}.$$
(9)

Seasonal:

$$S_{t} = \alpha *x_{t} + \left( {1 - \alpha } \right)*\left( {s_{t - 1} + b_{t - 1} } \right).$$
(10)

Trend:

$$b_{t} = \beta \left( {S_{t} - S_{t - 1} } \right) + \left( {1 - \beta } \right)b_{t - 1}.$$
(11)

Here, \(S_{t}\) is the weighted average observation of Yt. The one-period-ahead simulation is given by \(S_{t} + hb_{t}\). Equation (11) shows that bt is the weighted average of estimated trend at time t. The h-step-ahead simulation is equal to the last estimated level plus h times the last estimated trend value. Thus, Eq. 9 is the linear function of h.

2.3 Drought evaluation parameters—Intensity and frequency

Wu et. al [39], Ghosh [27] and Hayes et. al. [49] identified intensity and occurrence rate as the basic ingredients of drought. Statistical models may be applied to simulate SPI in different time steps. Comparative assessment of different statistical models is one of the best keys in spatiotemporal simulation of drought (Hayes et. al. [50], Ghosh [51], Majumder et. al. [52], Touma et. al. [53], Chen et. al. [54]). Welner and Santner [55], Jeong et. al. [56] and Aryal and Zhu [57] analyzed changing pattern of drought frequency and intensity in different time steps. The present research utilizes peak intensity and occurrence rate as the drought evaluation parameters:

2.3.1 Peak intensity (PID)

ID denotes departure of a climate index from its normal value (Thomposon [58]). Here, drought event is defined as a period in which SPI is continuously reaching a value toward − 0.99 or less. Here, PID denotes the lowest value of SPI which is simulated by observed and exponential smoothing models at different time steps. Lesser the value (− 0.99 or less) more will be the drought intensity.

2.3.2 Occurrence rate of drought (FD)

FD is used to assess the drought liability during the study period (Wang et. al. [59]). The number of droughts per 35 years is calculated as:

$$F_{{{\text{D}}j \cdot 35}} = \frac{{N_{j} }}{j \cdot n} \times 100\left( \% \right)$$
(12)

where \(F_{Dj \cdot 35}\) is the frequency of droughts for timescale j in 35 years, \(N_{j}\) is the number of months with droughts for the timescale j in the n year set, j is the timescale (e.g., here, 3, 6, 12, 24 and 48 months), and n is the number of years in the dataset. FD is also known as the frequency of drought.

Drought intensity and occurrence rate are estimated using observed and simulated SPI in 3, 6, 12, 24 and 48 months time steps, respectively.

2.4 Drought risk zone estimation using PCA

Principal component analysis (PCA) is one of the most effective and popular techniques to estimate drought risk zones (Cai et. al. [60], Dinpashoh et. al. [61]). PCA is a multivariate technique that reduces the dimensionality of the dataset and computes a set of new orthogonal variables in decreasing order (Demsar et. al. [62]). The technique of Joliffe [63] has been followed in this research, to estimate principal components. In this study, principal components are estimated in 3 months, 6 months and 48 months time steps. Here, the technique of Joliffe 2002 is used to calculate principal components using intensity and frequency of drought in different time steps. Here, it is assumed that x is a vector of p random variables. In this study, PCA is concerned with the correlation and covariance. A linear function \(\alpha^{\prime } x\) of the elements of x having maximum variance where α1 is a vector of p constants α11, α12,, α1p, and denotes transpose, so that

$$\alpha^{\prime}x = \alpha_{11} x_{1} + \alpha_{12} x_{2} + \alpha_{13} x_{3} + \cdots + \alpha_{1p} x_{1} = \mathop \sum \limits_{j = 1}^{p} \alpha_{1j} x_{j}.$$
(13)

Next, a linear function \(\alpha_{2}^{\prime } x\) uncorrelated with \(\alpha_{1}^{\prime } x\) having maximum variance, and so, at the k-th stage a linear function \(\alpha_{k}^{\prime } x\) is found that has maximum variance subject to being uncorrelated with \(\alpha_{1}^{\prime } x,\alpha_{2}^{\prime } x, \ldots \alpha_{k - 1}^{\prime } x\). The k-th derived variable, \(\alpha_{k}^{\prime } x\), is the k-th PC. Here, \(\alpha_{1}^{\prime } x\) the vector \(\alpha_{1}\) maximizes such that \({\text{var}} \left[ {\alpha^{\prime}_{1} x} \right] - \alpha_{1}^{\prime } \sum \alpha_{1}^{\prime } = 0\). To maximize \(\alpha_{1}^{\prime } \sum \alpha_{1}^{\prime }\), here the Lagrangian multipliers are used. \(\alpha_{1}^{\prime } \sum \alpha_{1}^{\prime } - \gamma \left( {\alpha_{1}^{^{\prime}} \sum \alpha_{1} - 1} \right)\) is maximized where \(\gamma\) is the Lagrangian multiplier. With respect to \(\alpha_{1}\), differentiation of the function \(\sum \alpha_{1} - \gamma \alpha_{1} = 0\) is done. Now, here \(I_{p}\) is considered as the identity matrix \(\sum \alpha_{1} - \gamma I_{p} = 0\) where \(\gamma\) is the eigenvalue of ∑ and \(\alpha_{1}\) is the corresponding eigenvector. To decide which of the p eigenvectors gives maximum variances, the condition \(\left[ {\alpha^{\prime}_{1} x} \right] = \alpha_{1}^{\prime } \sum \alpha_{1}^{\prime } = \gamma\) is used. The k-th PC of x is \(\left[ {\alpha^{\prime}_{k} x} \right] = \gamma_{k}\) where \(\gamma_{k}\) is the k-th largest eigenvalue of ∑. \(\alpha^{k}\) is the corresponding eigenvector. The second PC \(\alpha^{\prime}_{2} x\) maximizes \(\alpha^{\prime}_{2} \sum \alpha_{2}\) subject to being uncorrelated \(\alpha^{\prime}_{1} x\) or equivalently subject to \({\text{cov}} \left( {\alpha^{\prime}_{1} x,\alpha^{\prime}_{2} x} \right) = 0\), i.e., \({\text{cov}} \left( {x,y} \right)\) signifies covariance between x and y. Differentiation with respect to \(\alpha_{2}\) gives us \(\sum \alpha_{2} - \gamma \alpha_{2} -\upphi \alpha_{1}^{^{\prime}} \alpha_{1} = 0\) and multiplying the covariance equation with \(\alpha_{1}^{^{\prime}}\) \(\alpha_{1}^{^{\prime}} \sum \alpha_{2} - \gamma \alpha_{1}^{^{\prime}} \alpha_{2} -\upphi \cdot \alpha_{1}^{^{\prime}} \alpha_{1} = 0\) is obtained. So, In case of second PC, \(\sum \alpha_{2} - \gamma \alpha_{2} = 0\) equation is obtained, and with respect to identity matrix, \(\sum \alpha_{2} - \gamma I_{p} = 0\) equation is obtained. Here, \(\alpha_{2}\) is the corresponding eigenvector. The success of PCA is due to following two optional properties (Zou et. al. [64]):

  1. a.

    Principal components sequentially capture maximum variability in the given dataset X, thus generating a minimum information loss

  2. b.

    Principal components are uncorrelated so one principal component can be mentioned without referring to others;

2.5 Spatial analysis through IDW method

In order to generalize calculated SPI values in six stations to whole study area, inverse distance weighting (IDW) interpolation is used where power was considered as 2, searching neighborhood was standard with at least 10 neighborhoods. The inverse distance weighting method is an established deterministic method of SPI mapping (Alam et. al. [24], Attore et. al. [65]) which can be applied in little or more spatial information. IDW is one of the frequently used techniques of spatial interpolation and mapping (Lu and Wong [66]).

2.6 Interrelationship assessment, error judgment and significance test of the models

2.6.1 Interrelationship assessment between models using R2 value

Pearson’s correlation coefficient is the popular measure of the linear correlation between two variables. The value lies between − 1 and + 1 where − 1 represents total negative correlation and + 1 represents total positive correlation between variables, whereas 0 represents no correlation.

$$r = \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {x_{i} - \overline{x}} \right)\left( {y_{i} - \overline{y}} \right)}}{{\sqrt {\mathop \sum \nolimits_{i = 1}^{n} \left( {x - \overline{x}} \right)^{2} } \sqrt {\mathop \sum \nolimits_{i = 1}^{n} (x - \overline{x)}^{2} } }}$$
(14)

where xi is the observed value at time i, yi is the simulated value at time i, \(\overline{x}\) is the mean of the observed value, and \(\overline{y}\) is the mean of the simulated value. Positive R2 value indicates direct correlation, and negative value indicates inverse correlation.

2.6.2 Estimation of root mean squared error (RMSE)

Root mean squared error (RMSE) serves to aggregate the magnitude of errors in simulation of various time steps. RMSE has the great adjustability in calculation of accuracy measures in different time steps. Mavromatis [9] used RMSE in assessment of simulation models. The formula of RMSE is as follows

$${\text{RMSE}} = \sqrt {\mathop \sum \limits_{v = 1}^{n} \left( {O_{v} - M_{v} } \right)^{2} /n}$$
(15)

where Ov denotes observed value, Mv denotes the simulated model value, and n is the number of observation.

In this study, Holt–Winter model (HW) and double exponential (DE) models are considered for simulation. So, two RMSEs are obtained:

$${\text{RMSE}}_{1} = \sqrt {\mathop \sum \limits_{v = 1}^{n} \left( {O_{v} - {\text{HW}}_{v} } \right)^{2} /n}$$
(16)
$${\text{RMSE}}_{2} = \sqrt {\mathop \sum \limits_{v = 1}^{n} \left( {O_{v} - {\text{DE}}_{v} } \right)^{2} /n}.$$
(17)

Thus, combined RMSE can be stated as (from Eq. 16 and Eq. 17):

$${\text{RMSE}}_{1} + {\text{RMSE}}_{2} = \sqrt {\mathop \sum \limits_{v = 1}^{n} \left( {O_{v} - {\text{HW}}_{v} } \right)^{2} /n} + \sqrt {\mathop \sum \limits_{v = 1}^{n} \left( {O_{v} - {\text{DE}}_{v} } \right)^{2} /n}.$$

High RMSE value means there is a larger difference between the actual and simulated drought exists in the particular time step at a particular year or months (Zalenski et. al. [67]). Lower RMSE value indicates higher accuracy.

2.6.3 Anderson–Darling test (AD test)

Anderson–Darling test compares empirical cumulative distribution of sample data series. This test is very effective when the data series becomes sufficiently large. The Anderson–Darling test is the hypothesized distribution is F, and cumulative distribution is Fn and the formula can be written as

$$A^{2} = n\mathop \int \limits_{ - \infty }^{\infty } \frac{{(F_{n} \left( x \right) - F\left( x \right)^{2} }}{{F\left( x \right)\left( {1 - F\left( x \right)} \right)}}{\text{d}}F\left( x \right).$$
(18)

2.6.4 Kolmogorov–Smirnov test (KS test)

Kolmogorov–Smirnov test is a nonparametric test of the equality of continuous one-dimensional probability distribution with compare of a sample with reference probability distribution (Kolmogorov [68] and Smirnov [69]). Kolmogorov–Smirnov test statistic can be expressed as (Razali et. al. [70])

$$F_{n} \left( x \right) = 1/n\mathop \sum \limits_{i = 1}^{n} I_{{\left[ { - \infty ,x} \right]}} \left( {X_{i} } \right)$$
(19)

where \(I_{{\left[ { - \infty ,x} \right]}} \left( {X_{i} } \right)\) is the indicator function, equal 1 if \(\left( {X_{i} } \right)\) ≤ x and equal to 0 otherwise.

The Kolmogorov–Smirnov statistic of a given cumulative function F(x) is

$$D_{n} = \mathop {\sup }\limits_{x} (F_{n} x - F_{x} )$$
(20)

where sup is the supremum of the set of distance between the \(F_{n} x\) and \(F_{x}\). In our case, this model has been run at 95% significance level.

2.6.5 Ryan joiner test (RJ test)

Ryan Joiner test assesses correlation of a given data series (Yap and Sim [71]). It assesses the strength of the correlation of a given data matrix. If it falls below the critical value, the null hypothesis can be rejected and alternative hypothesis can be accepted. As the correlations of datasets are represented by this test, this procedure is important in judgment of simulation models.

3 Results

Figure 4a–j represents the scatter plots which represent the interrelationship between different model variables. The interrelationships of different model variables show that all the models at every time steps are strongly correlated with above 90% R-squared value. About 96–98% correlation coefficient is observed between double exponential smoothing and observed models. Correlation coefficient value decreases slightly between observed and Holt–Winter exponential smoothing. Significance level of every model is estimated by KS, AD and RJ test. Almost all models are significant at 0.01, 0.05 or 0.005 significance level. Table 5 (in supplementary file) represents the results of significance test.

Fig. 4
figure 4

Correlation between different model variables in different time steps

3.1 Station-wise comparison of 3 months and 6 months SPI

According to observed model, at the 3 months time step, drought (SPI value − 3.125) is at the peak in November 1979 at the station 229,875. At the same time step, according to observed model, extreme to severe (ES) and moderate drought occur at the rate of 8% and 10%, respectively, at the station 233,869. According to observed model, at 6 months time step, peak drought (SPI value − 5.061) is observed in January 1980 at station 233,875. At the same time step, according to observed model, ES and moderate drought occurrence rate (8% for ES and 10% for moderate drought) is highest at 233,869 and 229,875 stations. At, 3 months and 6 months time steps, simulated curves follow observed curves to denote dry and wet conditions (Fig. 5a, b, d, e, g, h). At 3 months time step, SPI simulated by double exponential model is characterized by − 2.387 value (peak intensity) which is observed in January 1993. At 3 months time step, the Holt–Winter model-simulated drought is at its peak in January 1979 at station 233,872. At 6 months time step, double exponential smoothing simulated drought reaches at the peak in April 1980 at the station 233,872. Similarly, Holt–Winter model-simulated drought at the same time step reaches at the lowest level with -2.361 SPI value which is occurred in February 1983 at station 229,872. Drought (category ES) simulated by double exponential and Holt–Winter model reaches at the highest rate of occurrence (about 7% for double exponential and 11% for Holt–Winter) at the station 229,872. Similarly, double exponential and Holt–Winter exponential model simulated moderate drought occurs at the highest rate (e.g., 13% for double exponential and 10% for Holt–Winter) at stations 233,869 and 229,875, respectively. Overall at 3 months and 6 months time steps, drought intensity is at its peak in post monsoon phase. Station-wise drought occurrence rate and drought intensity at different time steps are denoted in Table 4 (in supplementary file).

Fig. 5
figure 5

3-Month, 6-month and 12-month observed and simulated SPI over 1 month lead time of some selected stations (for the efficient understanding, we have mentioned here three stations—station 229,872 and 229,875 have similar nature of drought with station 233,869. 233,872 station has similar characteristics of 233,875)

3.2 Station-wise drought comparison of 12 months, 24 months and 48 months SPI

According to observed model, at 12 months time steps, drought intensity is at its peak with SPI value -5.291 which is observed in December 1979 at station 233,872. At the same time step, according to observed model, ES and moderate droughts are observed in their highest rate of occurrence at the station 229,875 and 233,875. At 24 months time steps, according to observed model, peak drought intensity (SPI − 2.391) is observed in December 1980 at the station 233,872. At 24 months time steps, ES and moderate droughts are noticed with 4% and 14% occurrence rate, respectively, at 233,872 and 233,875 meteorological station. Similarly at 48 months time steps, drought is intensified at the station 233,872 in December 1986 with − 1.513 SPI value. At this time step, ES and moderate drought reach their highest rate of occurrence (e.g., 5–15%) at 233,872, 233,875 and 233,869 station.

The simulated SPI at long-term time frame also follows the observed curves (Fig. 5c, f, i and Fig. 6–f). The SPI (at 12 months time steps) simulated by double exponential model reaches at the lowest level (SPI − 5.667) in July 1980. Holt–Winter model-simulated drought (category ES) at 12 months time steps reaches at the peak (SPI value − 3.771) in April 1980 at the station 233,872. At 24 months time steps, drought simulated by double exponential and Holt–Winter model reaches at the peak with − 2.291 SPI value which is observed in 1984 (please see Table 4 in supplementary section). At 12 months time step, the double exponential model simulated ES and moderate drought reach their highest rate of occurrence (about 13% and 11%, respectively) at the station 233,869. At the same time frame, Holt–Winter model simulated ES and moderate droughts are found in their highest rate of occurrence (about 14%) at 233,869 and 229,872 stations, respectively. At 24 months time frame, double exponential smoothing simulated ES and moderate droughts are identified with 7% and 14% occurrence rate at 233,869 and 229,875 stations, respectively. At 48 months time step, the occurrence rate of extreme to severe drought is negligible. Moderate drought (double exponential model simulated) is observed with 17% occurrence rate at 48 months time step at 229,872 station. Figure 7a–f expresses station-wise drought occurrence rate.

Fig. 6
figure 6

24-month and 48-month observed and simulated SPI over 1 month lead time of some selected stations (for the efficient understanding, we have mentioned here three stations—station 229,872 and 229,875 have similar nature of drought with station 233,869. 233,872 station has similar characteristics of 233,875)

Fig. 7
figure 7

% of Occurrence of extreme to severe and moderate drought using observed and simulated models (in 1 month lead time) (% frequency of extreme to severe and moderate drought at 12 months and 24 months time step are almost similar with the 48 months time step)

3.3 Spatial assessment

Overall northwestern and western portions of the district face highest intensity of drought. This region also experiences highest occurrence rate of extreme to severe and moderate drought at almost all the time frames. At 3 months and 6 months time frame, south and southwestern portions experience − 2 to − 3 drought intensity value. On the other hand, − 3 to − 5 drought intensity is observed in north and northwestern portions of the study region (Fig. 8). At 12, 24 and 48 months time frame, the peak drought is observed at northwestern part of the study region (SPI value is − 1 to − 3). At 3 months and 6 months time frame, extreme to severe and moderate drought occurrence rate varies from 6 to 3% and 5 to 11%, respectively. This fact is applicable for all models at above mentioned time steps. Southern and southwestern portion of the region experiences 3–4% extreme drought and 5–6% moderate drought at all the time steps. Drought occurrence rate is noticed at a significant level (5–6% extreme to severe, 10–11% moderate drought) in western and northwestern portion of the district at 3 months and 6 months time steps. At 12, 24 and 48 months time frame, extreme drought is observed in northwestern part of the district with 2–4% occurrence rate. Figure 8a–o denotes visualization of drought intensity spatially. Figures 9a–o and 10a–o express extreme to severe and moderate drought occurrence rate spatially. As the time step increases, drought intensity decreases and extreme to severe drought becomes less frequent. On the contrary, as the time step advances, moderate drought becomes more frequent.

Fig. 8
figure 8

Spatial distribution of peak drought intensity based on observed and simulated SPI (in 1 month lead time)

Fig. 9
figure 9

Spatial distribution of occurrence (%) of severe to extreme drought (as per observed and simulated models (1 month lead time))

Fig. 10
figure 10

Spatial distribution of occurrence (%) of moderate drought [as per observed and simulated models (1 month lead time)]

3.4 Drought-prone zone identification

Table 3 identifies all variables used in PCA. At 3 months time steps V1, V2, V3, V10, V11,V12, V19, V20 and V21, at 6 months time steps V4, V5, V6, V13, V14, V15, V22, V23 and V24 and at 48 months time step V7, V8, V9, V16, V17, V18, V25, V26 and V27 are considered. Similarly, at 12 months and 24 months time step, V28 to V45 variables are used. For every time step, first four components represent near about 80% of the dataset that is why first four components are considered for determination of PCA. For every time step, northwestern and western portions are demarcated as extreme to severe drought proneness. On the other hand, the eastern edge and southwestern portions are at mild to normal drought-prone condition. At all the time frame, Saltora, Mejhia, Gangajalghati, Chattna, Bankura-I, Bankura-II, Indpur, Onda, Barjora, Sonamukhi and Hirbandh are the highest drought-prone blocks. Patrasayar, Indus, Kotulpur, Jaypur, Bishnupur, Taldangra and Simlipal are the moderate drought-prone blocks. Khatra and Ranibandh are the least drought-affected blocks. The drought-affected region is marked in Fig. 11a–f.

Table 3 Considered variables in PCA
Fig. 11
figure 11

Drought-prone zone identification using PCA

3.5 Assessment of root mean square error (RMSE)

At 3 months and 6 months time step, the highest RMSE is observed in 1998–1999 at 229,869 stations. Station 233,869 and 2338,75 achieve the highest RMSE in 2010–2012 at 3 months and 6 months time steps (Fig. 12a, b, d, e, g, h). 0.1 to 20 RMSE is observed in 12, 24 and 48 months time step. As time step increases, RMSE starts to decrease in a significant proportion. At 12 months time step, the highest error occurs in 2012–2014 at the station 229,869 and station 233,869 (Fig. 12c, f). At this time step, the highest RMSE is observed in 1998–2000 at station 233,875 (Fig. 12i). At 24 months time step, the 229,869 station at 1989–1991 achieved the highest error (Fig. 13a). However, station 233,872 and 233,875 achieved the highest error in 2012–2014 (Fig. 13c, e). Similar nature is also observed in 48 months time frame (Fig. 13b, d, f). High RMSE is observed at northern and northwestern parts at almost all the time frames (Fig. 14a–j). It is interesting to note that extreme to severe drought-prone stations face higher amount of RMSE. In the 3 months time frame, southern and southwestern portions are experiencing total 104–110 RMSE, whereas northern and northwestern portions are experiencing total 111–116 RMSE. At 6 months time frame, southern and southwestern portions are experiencing total 40–44 RMSE, whereas 45–51 total RMSE is observed in northern and northwestern portions. At 12 months, 24 months and 48 months time step, 0.12–14 RMSE is observed at southern and southwestern portion, whereas northern and northwestern portions experience 14–19 RMSE value. With respect to combined error of Holt–Winter and double exponential model, double exponential model produces better result with slightly low (2%–4%) RMSE.

Fig. 12
figure 12

Estimation of root mean square error (RMSE) of some selected stations () at 3 months, 6 months and 12 months time steps (for the efficient understanding, we have mentioned here three stations—station 229,872 and 229,875 have similar nature of drought with station 233,869. 233,872 station has similar characteristics of 233,875)

Fig. 13
figure 13

Estimation of root mean square error (RMSE) of some selected stations () at 24 and 48 months time frame (for the efficient understanding, we have mentioned here three stations—station 229,872 and 229,875 have similar nature of drought with station 233,869. 233,872 station has similar characteristics of 233,875)

Fig. 14
figure 14

Spatial distribution of root mean square error (RMSE) at different time steps

4 Discussion and conclusion

In summary, we simulate meteorological drought for Bankura District using double exponential and Holt–Winter exponential model at several time steps. We also compared the efficiency of these two models, which shows that double exponential smoothing model is more accurate one. In our analysis, intensity and frequency of drought act as most important parameters. Intensity and frequency of drought are portrayed using visual interpretative maps and statistical assessment for several time steps. Our study shows that peak intensity of drought decreases with increasing time steps (Fig. 8a–o) and the scenario of drought shifts from extreme to moderate condition (Figs. 5 and 6). RMSE also decreases with increasing time steps (Figs. 12 and 13). Our study reveals that drought scenario is worst for north and northwestern region of Bankura, while the effect of drought is less severe in other regions (southeast, southwest) (Fig. 11a–e). Furthermore, we perform principal component analysis by combining the results from the two models, to delineate drought-prone zones of the district (Fig. 11a–e). Overall the principal component analysis (PCA) reveals that north and northwest portions of the study area are the ES drought-affected zone, while drought is mild at southern portions of the study region.

This study includes several time steps for spatiotemporal simulation of drought, which makes the study interesting and unique. The identification and monitoring of different drought-related parameters at the same time and effective identification of drought-prone zones make this study fruitful one for implementation. This study may also be useful to take actions for improved resiliency of the water management infrastructure, including a more accurate drought forecasting tool for sustainable agricultural production in Bankura District.