Abstract
This study was designed to find the best-fit probability distribution of annual maximum rainfall based on a twenty-four-hour sample in the northern regions of Pakistan using four probability distributions: normal, log-normal, log-Pearson type-III and Gumbel max. Based on the scores of goodness of fit tests, the normal distribution was found to be the best-fit probability distribution at the Mardan rainfall gauging station. The log-Pearson type-III distribution was found to be the best-fit probability distribution at the rest of the rainfall gauging stations. The maximum values of expected rainfall were calculated using the best-fit probability distributions and can be used by design engineers in future research.
1 Introduction
Pakistan is located in South Asia, at a latitude of approximately 33.6667° N and a longitude of 73.1667° E, in the northern and eastern hemispheres. Pakistan experiences a diversified climate throughout the year. The minimum temperature is as low as –25°C in northern areas, and the maximum temperature is as high as 55°C in southern areas. Most of Pakistan experiences a dry climate, while humid conditions prevail in northern areas. In Pakistan, monsoons and evaporation from western depressions are the sources of rainfall. Monsoons contribute 65 to 75% of the total rainfall in Pakistan. Rainfall, which feeds lakes and rivers, is the most vital natural source of water for humans, animals and crops. Predicting the future occurrence and distribution of rainfall based on the amounts received in previous years has proved difficult, and the results have been unreliable. Hydrological events such as rainfall, which occurs as a natural phenomenon, are observed at the event scale. The efficient management and use of water resources can be enhanced by rainfall analyses using probability distributions and annual maximum daily rainfall [1]. The expected rainfall in different return periods is determined through probability and frequency analysis of rainfall data [2]. To reduce flood damage and to design and construct hydrologic projects such as dams, dykes and urban drainage systems, the management and implementation of water resource strategies require reliable data regarding extreme events with high return periods [3]. Various probability distributions are currently used to predict expected rainfall in different return periods, as rainfall varies with time and location [4]. Frequency analyses of rainfall data have been performed for different return periods [5-9]. The expected rainfall values in different return periods, which may be greater or less than the recorded rainfall values, are estimated using a fitted distribution.
The damage caused by storms can be reduced by the precise estimation of extreme rainfall, leading to the efficient design of hydraulic structures. A number of probability models have been developed to depict the distribution of extreme rainfall at a site [3]. The choice of an appropriate distribution model is one of the major problems in engineering practice. The selection mainly depends on the available rainfall data at a particular site. To find a suitable distribution model that will provide accurate estimates of extreme rainfall, it is necessary to evaluate the available distribution models. The probability models most commonly used to estimate rainfall frequency are the normal, log-normal, log-Pearson type-III and Gumbel distributions. The objective of the study is to find the best-fit probability model and perform a probability analysis of 24-hour annual maximum rainfall in northern Pakistan, as rainfall in this area is the main source of water for the irrigation network in the country.
2 Material and methods
Probability distributions are basic concepts in statistics. The results of statistical experiments and their probabilities of occurrence are linked by probability distributions. Rainfall data from northern Pakistan were evaluated with four probability models to find the best-fit model. The probability models used include the normal (N), log-normal (LN), log-Pearson type III (LP3) and Gumbel (EVI) probability models.
2.1 Normal distribution
The normal distribution is one of the most widely used continuous distributions. The probability density function (PDF) and cumulative distribution function (CDF) of the normal distribution are calculated using Eqs. (1) and (2), respectively:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right] \tag{1}$$
$$F(x) = \Phi\left(\frac{x-\mu}{\sigma}\right) \tag{2}$$
where ‘μ’ is the location parameter, ‘σ’ is the scale parameter and ‘Φ’ is the Laplace integral.
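In the normal distribution, the maximum value of expected rainfall (XT) corresponding to any return period (T) can be calculated using Eq. (3):
$$X_T = \bar{x} + K_T\, s \tag{3}$$
where ‘XT’ is the maximum value of expected rainfall, ‘x̄’ and ‘s’ are the mean and standard deviation of the annual maximum rainfall series, and ‘KT’ is the frequency factor.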
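The frequency factor (KT) is the same as the standard normal variate ‘z’, which is calculated using Eq. (5):
$$z = w - \frac{2.515517 + 0.802853\,w + 0.010328\,w^{2}}{1 + 1.432788\,w + 0.189269\,w^{2} + 0.001308\,w^{3}} \tag{5}$$
In Eq. (5), the intermediate variable ‘w’ is expressed as follows:
$$w = \left[\ln\left(\frac{1}{p^{2}}\right)\right]^{1/2} \tag{6}$$
where ‘p’ is the exceedance probability (p = 1/T). When p > 0.5, 1 − p is substituted for ‘p’ in Eq. (6), and the resulting value of ‘z’ is given a negative sign.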
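The calculation described above can be sketched in Python. This is a minimal illustration, not the authors' code: it assumes the frequency-factor form XT = mean + KT × standard deviation (the form of the Mardan expression in Table 4) and the widely used rational approximation for the standard normal variate. The function names are hypothetical.

```python
import math


def normal_frequency_factor(T: float) -> float:
    """Approximate standard normal variate z (= K_T) for a return period T in years."""
    if T <= 1.0:
        raise ValueError("return period must be greater than 1 year")
    p = 1.0 / T                       # exceedance probability
    q = 1.0 - p if p > 0.5 else p     # substitute 1 - p when p > 0.5
    w = math.sqrt(math.log(1.0 / q ** 2))
    z = w - (2.515517 + 0.802853 * w + 0.010328 * w ** 2) / (
        1.0 + 1.432788 * w + 0.189269 * w ** 2 + 0.001308 * w ** 3
    )
    # z takes a negative sign when the substitution was made
    return -z if p > 0.5 else z


def expected_rainfall_normal(mean: float, sd: float, T: float) -> float:
    """X_T = mean + K_T * sd (assumed frequency-factor form)."""
    return mean + normal_frequency_factor(T) * sd
```

With the Mardan statistics from Table 1 (mean 77.57, standard deviation 27.73), `expected_rainfall_normal(77.57, 27.73, 100)` gives a value close to the 100-year estimate for Mardan in Table 5.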
2.2 Log-normal distribution
The log-normal distribution describes a random variable whose logarithm is normally distributed; that is, a random variable Y follows the log-normal model when log(Y) is normally distributed. The probability density function (PDF) and cumulative distribution function (CDF) of the log-normal distribution are calculated using Eqs. (7) and (8), respectively:
$$f(x) = \frac{1}{(x-\gamma)\,\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{\ln(x-\gamma)-\mu}{\sigma}\right)^{2}\right] \tag{7}$$
$$F(x) = \Phi\left(\frac{\ln(x-\gamma)-\mu}{\sigma}\right) \tag{8}$$
where ‘μ’ and ‘σ’ are the mean and standard deviation of ln(x − γ), ‘γ’ is the location parameter and ‘Φ’ is the Laplace integral.
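The log-normal distribution assumes that Y = ln(X); therefore, the maximum value of expected rainfall (XT) corresponding to any return period (T) can be calculated using Eq. (9):
$$X_T = \exp\left(\bar{y} + K_T\, s_y\right) \tag{9}$$
where ‘ȳ’ and ‘s_y’ are the mean and standard deviation of Y, and ‘KT’ is the frequency factor of the standard normal distribution.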
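A short Python sketch of the log-normal estimate follows. It assumes the quantile is obtained by applying the normal frequency factor in log space and adding the location parameter γ back (γ = 0 for the two-parameter case). The function name is hypothetical, and the inverse normal CDF comes from Python's standard library rather than from the approximation used in Section 2.1.

```python
import math
from statistics import NormalDist


def expected_rainfall_lognormal(mu: float, sigma: float, T: float,
                                gamma: float = 0.0) -> float:
    """Quantile of a (three-parameter) log-normal for return period T in years.

    mu, sigma: mean and standard deviation of ln(x - gamma)
    gamma:     location parameter (0 for the two-parameter model)
    """
    if T <= 1.0:
        raise ValueError("return period must be greater than 1 year")
    z = NormalDist().inv_cdf(1.0 - 1.0 / T)   # standard normal variate
    return gamma + math.exp(mu + sigma * z)
```

For example, with the Kalam log-normal parameters in Table 2 (μ = 4.06, σ = 0.375, γ = 0), the 2-year value equals exp(4.06), about 58 mm.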
2.3 Log-Pearson type-III distribution
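The log-Pearson type-III distribution has been widely used for hydrologic frequency analyses since it was recommended by U.S. federal agencies. The probability density function (PDF) and cumulative distribution function (CDF) of the log-Pearson type-III distribution are calculated using Eqs. (12) and (13), respectively:
$$f(x) = \frac{1}{x\,|\beta|\,\Gamma(\alpha)}\left(\frac{\ln x-\gamma}{\beta}\right)^{\alpha-1}\exp\left(-\frac{\ln x-\gamma}{\beta}\right) \tag{12}$$
$$F(x) = \frac{\Gamma_{(\ln x-\gamma)/\beta}(\alpha)}{\Gamma(\alpha)} \tag{13}$$
where ‘α’, ‘β’ and ‘γ’ are shape, scale and location parameters, respectively, ‘Γ’ is the gamma function and ‘Γ_z’ is the lower incomplete gamma function. When β is negative, the ratio in Eq. (13) gives the exceedance probability 1 − F(x).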
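In the log-Pearson type-III distribution, the maximum value of expected rainfall (XT) corresponding to any return period (T) can be calculated using Eq. (14):
$$\log X_T = \overline{\log x} + K_T\, s_{\log x} \tag{14}$$
where the overlined term and ‘s’ are the mean and standard deviation of the log-transformed annual maximum rainfall series, and ‘KT’ is the frequency factor, which depends on the return period and on the coefficient of skewness of the log-transformed data.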
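The sketch below shows one way to evaluate a log-Pearson type-III estimate in Python. It rests on assumptions not stated in the text: that ln(X) follows a Pearson type-III (shifted gamma) distribution with shape α, scale β and location γ, so that its mean is γ + αβ, its standard deviation is √α·|β| and its skewness is 2/√α with the sign of β; and that the frequency factor is taken from Kite's polynomial approximation, which is substituted here for illustration. Under these assumptions the Kalam parameters in Table 2 give a log mean of about 4.06, the same as the log-normal μ for Kalam. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def kite_frequency_factor(T: float, skew: float) -> float:
    """Kite's approximation of the Pearson type-III frequency factor K_T."""
    z = NormalDist().inv_cdf(1.0 - 1.0 / T)   # standard normal variate
    k = skew / 6.0
    return (z + (z ** 2 - 1.0) * k + (z ** 3 - 6.0 * z) * k ** 2 / 3.0
            - (z ** 2 - 1.0) * k ** 3 + z * k ** 4 + k ** 5 / 3.0)


def expected_rainfall_lp3(alpha: float, beta: float, gamma: float,
                          T: float) -> float:
    """Estimate X_T from assumed shape/scale/location parameters of ln(X)."""
    if T <= 1.0:
        raise ValueError("return period must be greater than 1 year")
    mean_ln = gamma + alpha * beta                    # mean of ln(X)
    sd_ln = math.sqrt(alpha) * abs(beta)              # standard deviation of ln(X)
    skew_ln = math.copysign(2.0 / math.sqrt(alpha), beta)
    return math.exp(mean_ln + kite_frequency_factor(T, skew_ln) * sd_ln)
```

With the rounded Kalam parameters from Table 2, `expected_rainfall_lp3(51.24, -0.053, 6.78, 100)` returns roughly 130 mm, close to the 100-year estimate for Kalam in Table 5; small differences are expected because the tabulated parameters are rounded.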
2.4 Gumbel (EV I) distribution
The Gumbel distribution, named in honor of Emil Gumbel and also known as the Extreme Value Type I (EV I) distribution, is a continuous probability distribution. This distribution can be applied to model maximum or minimum values (extreme values) of a random variable. The probability density function (PDF) and cumulative distribution function (CDF) of the Gumbel distribution are calculated using Eqs. (17) and (18), respectively:
$$f(x) = \frac{1}{\sigma}\exp\left[-\frac{x-\mu}{\sigma}-\exp\left(-\frac{x-\mu}{\sigma}\right)\right] \tag{17}$$
$$F(x) = \exp\left[-\exp\left(-\frac{x-\mu}{\sigma}\right)\right] \tag{18}$$
where ‘σ’ and ‘μ’ are the scale and location parameters, respectively.
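The Gumbel distribution can be used to calculate the maximum value of expected rainfall (XT) corresponding to any return period (T) using Eq. (19):
$$X_T = \bar{x} + K_T\, s \tag{19}$$
where ‘x̄’ and ‘s’ are the mean and standard deviation of the annual maximum rainfall series, and the frequency factor is
$$K_T = -\frac{\sqrt{6}}{\pi}\left\{0.5772 + \ln\left[\ln\left(\frac{T}{T-1}\right)\right]\right\}$$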
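As a hedged illustration, the Gumbel estimate can also be obtained by inverting the Gumbel (maximum) CDF for a non-exceedance probability of 1 − 1/T, given the location μ and scale σ. The function name below is hypothetical.

```python
import math


def expected_rainfall_gumbel(mu: float, sigma: float, T: float) -> float:
    """Gumbel (EV I, maximum) quantile for a return period T in years.

    Solves exp(-exp(-(x - mu) / sigma)) = 1 - 1/T for x.
    """
    if T <= 1.0:
        raise ValueError("return period must be greater than 1 year")
    return mu - sigma * math.log(-math.log(1.0 - 1.0 / T))
```

With the Kalam Gumbel parameters in Table 2 (μ = 51.82, σ = 18.28), the 2-year value is about 58.5 mm. When μ and σ come from the method of moments, this inversion is algebraically the same as adding a frequency factor times the standard deviation to the mean.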
3 Results and discussion
The northern area of Pakistan is surrounded by the Himalayan, Karakoram, Hindu Kush and Pamir mountain ranges, with peaks between 6500 m and 8600 m. The snowmelt from these mountains, combined with glacier melt and monsoon rainfall, contributes to the many rivers, most notably the Indus River, that Pakistan has relied on to develop an advanced irrigation canal network. However, the distribution and quantity of monsoon rainfall, which occurs due to seasonal winds and western disturbances, vary widely throughout the year. In northern areas, such as Khyber Pakhtunkhwa and Balochistan provinces, the maximum rainfall occurs from December to March, and in Punjab and Sindh, the maximum rainfall (50-75%) occurs during the monsoon season [10-15].
The 24-hour annual maximum rainfall data from six rainfall gauging stations in northern Pakistan were used in this study. The locations of these stations are shown in Figure 1. A summary of the statistics is presented in Table 1. These statistical parameters are used to calculate the estimated 24-hour annual maximum rainfall in different return periods using different probability distributions. Of the six selected stations, Oghi has 46 years of rainfall data, spanning from 1961 to 2010. Three stations (Kalam, Daggar and Mardan) have 44 years of rainfall data, spanning from 1962 to 2009, 1963 to 2010 and 1963 to 2010, respectively. Two stations (Puran and Besham Qilla) have 38 years of rainfall data, spanning from 1963 to 2004 and 1969 to 2010, respectively.
Statistical parameter | Kalam | Oghi | Daggar | Mardan | Puran | Besham Qilla |
---|---|---|---|---|---|---|
Mean (mm) | 62.37 | 84.66 | 89.62 | 77.57 | 66.28 | 76.11 |
Coefficient of skewness | 0.91 | 0.84 | 0.94 | 0.39 | 0.63 | 1.33 |
Coefficient of variation | 0.38 | 0.34 | 0.35 | 0.36 | 0.27 | 0.36 |
Standard deviation (mm) | 23.45 | 28.99 | 31.72 | 27.73 | 17.81 | 27.73 |
Maximum value (mm) | 138.43 | 149.6 | 196.85 | 145 | 114.3 | 149.6 |
Minimum value (mm) | 19.3 | 45.72 | 38.61 | 23 | 38.1 | 40.89 |
Data collection years | 1962-2009 | 1961-2010 | 1963-2010 | 1963-2010 | 1963-2004 | 1969-2010 |
Data collection period (years) | 44 | 46 | 44 | 44 | 38 | 38 |
The distribution of 24-hour maximum rainfall observed during different months of a year is shown in Figure 2. Figure 2 shows that Kalam and Besham Qilla received 42% and 21%, respectively, of observed rainfall in March. Oghi, Daggar and Puran received 37%, 32% and 23%, respectively, of observed rainfall in July. Mardan received 37% of observed rainfall in August. These results suggest that the maximum rainfall at these selected stations occurred between March and August.
Four probability distributions (normal, log-normal, log-Pearson type-III and Gumbel) were used in this study. The parameters of probability distributions were calculated using the method of moments and are given in Table 2.
Distribution | Parameter | Kalam | Oghi | Daggar | Mardan | Puran | Besham Qilla |
---|---|---|---|---|---|---|---|
Normal | Sigma (σ) | 23.45 | 28.99 | 31.72 | 27.73 | 17.81 | 27.73 |
Normal | Mu (μ) | 62.37 | 84.66 | 89.62 | 77.57 | 66.28 | 76.11 |
Log-normal | Sigma (σ) | 0.375 | 0.326 | 0.345 | 0.144 | 0.261 | 0.322 |
Log-normal | Mu (μ) | 4.06 | 4.38 | 4.44 | 5.24 | 4.16 | 4.28 |
Log-normal | Gamma (γ) | 0 | 0 | 0 | -112.34 | 0 | 0 |
Log-Pearson type III | Alpha (α) | 51.24 | 67.56 | 13211 | 7.48 | 259.93 | 8.86 |
Log-Pearson type III | Beta (β) | -0.053 | 0.04 | -0.003 | -0.144 | 0.016 | 0.11 |
Log-Pearson type III | Gamma (γ) | 6.78 | 1.68 | 44.53 | 5.36 | -0.099 | 3.305 |
Gumbel max | Sigma (σ) | 18.28 | 22.61 | 24.73 | 21.62 | 13.88 | 21.62 |
Gumbel max | Mu (μ) | 51.82 | 71.61 | 75.34 | 65.09 | 58.27 | 63.64 |
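The Gumbel max entries in Table 2 are consistent with the textbook moment estimators, in which the scale is (√6/π) times the sample standard deviation and the location is the sample mean minus Euler's constant times the scale. The following Python sketch assumes these estimators; the function name is hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def gumbel_params_from_moments(mean: float, sd: float) -> tuple[float, float]:
    """Return (mu, sigma) of a Gumbel max distribution by the method of moments."""
    sigma = math.sqrt(6.0) / math.pi * sd   # scale from the standard deviation
    mu = mean - EULER_GAMMA * sigma         # location from the mean
    return mu, sigma
```

For Kalam (mean 62.37, standard deviation 23.45 in Table 1), this returns values that round to the tabulated μ = 51.82 and σ = 18.28.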
The four probability distributions were subjected to three goodness of fit tests (the Kolmogorov-Smirnov, chi-squared and Anderson-Darling tests) to determine the best-fitting probability distribution model at each rainfall gauging station. The tests were applied following the standard procedure described by several authors [16-18].
For each goodness of fit test, the probability distributions were ranked from one (best-fit) to four (least-fit).
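To make the idea of a goodness of fit statistic concrete, the sketch below computes the Kolmogorov-Smirnov statistic, that is, the largest gap between the empirical CDF of a sample and a fitted CDF. It is an illustrative implementation with a hypothetical function name, not the procedure or software used in this study.

```python
from typing import Callable, Sequence


def ks_statistic(sample: Sequence[float], cdf: Callable[[float], float]) -> float:
    """Kolmogorov-Smirnov statistic D for a sample against a fitted CDF."""
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d
```

A smaller D indicates a closer fit. For a station's annual maximum series, `cdf` could be, for instance, the `cdf` method of `statistics.NormalDist` built from the fitted mean and standard deviation.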
Selection of the best-fit probability distribution is based on the total score from all the goodness of fit tests. The results of goodness of fit tests at each selected rainfall gauging station and for each probability distribution used in this study are shown in Table 3. Based on the results of the goodness of fit tests, the best-fit probability distribution and mathematical expression for the calculation of rainfall in different return periods at each gauging station are shown in Table 4.
Station | Distribution model | Kolmogorov-Smirnov test | Chi-squared test | Anderson-Darling test | Total |
---|---|---|---|---|---|
Kalam | Normal | 1 | 2 | 1 | 4 |
Kalam | Log-normal | 3 | 4 | 2 | 9 |
Kalam | Log-Pearson type III | 4 | 3 | 3 | 10 |
Kalam | Gumbel | 2 | 1 | 4 | 7 |
Oghi | Normal | 1 | 1 | 1 | 3 |
Oghi | Log-normal | 2 | 3 | 2 | 7 |
Oghi | Log-Pearson type III | 4 | 4 | 4 | 12 |
Oghi | Gumbel | 3 | 2 | 3 | 8 |
Daggar | Normal | 1 | 4 | 1 | 6 |
Daggar | Log-normal | 3 | 1 | 3 | 7 |
Daggar | Log-Pearson type III | 4 | 3 | 4 | 11 |
Daggar | Gumbel | 2 | 2 | 2 | 6 |
Mardan | Normal | 4 | 3 | 3 | 10 |
Mardan | Log-normal | 1 | 1 | 2 | 4 |
Mardan | Log-Pearson type III | 3 | 2 | 4 | 9 |
Mardan | Gumbel | 2 | 4 | 1 | 7 |
Puran | Normal | 3 | 1 | 1 | 5 |
Puran | Log-normal | 2 | 4 | 3 | 9 |
Puran | Log-Pearson type III | 4 | 3 | 4 | 11 |
Puran | Gumbel | 1 | 2 | 2 | 5 |
Besham Qilla | Normal | 1 | 1 | 1 | 3 |
Besham Qilla | Log-normal | 3 | 2 | 2 | 7 |
Besham Qilla | Log-Pearson type III | 4 | 4 | 4 | 12 |
Besham Qilla | Gumbel | 2 | 3 | 3 | 8 |
Station | Best-fit distribution | Mathematical expression of best-fit distribution |
---|---|---|
Kalam | Log Pearson type III | Log (XT) = 1.77+0.17KT |
Oghi | Log Pearson type III | Log (XT) = 1.90+0.14KT |
Daggar | Log Pearson type III | Log (XT) = 1.93+0.15KT |
Mardan | Normal | XT = 77.57+27.73KT |
Puran | Log Pearson type III | Log (XT) = 1.81+0.12KT |
Besham Qilla | Log Pearson type III | Log (XT) = 1.86+0.14KT |
The normal distribution provides the best fit at the Mardan rainfall gauging station, while the log-Pearson type-III distribution provides the best fit at the other rainfall gauging stations analyzed in this study. Probability density functions (PDF) and cumulative distribution functions (CDF) at the rainfall gauging stations were calculated using the best-fit distribution, i.e., the normal distribution at Mardan and the log-Pearson type-III distribution at the rest of the rainfall gauging stations, and are shown in Figures 3 and 4.
The rainfall estimates or maximum values of expected rainfall (mm) for return periods of 2, 5, 10, 20, 50, 100 and 200 years at the rainfall gauging stations were calculated using the best-fit distribution. The rainfall estimates are given in Table 5.
Station | Best-fit distribution | 2-year | 5-year | 10-year | 20-year | 50-year | 100-year | 200-year |
---|---|---|---|---|---|---|---|---|
Kalam | Log-Pearson III | 59.29 | 80.50 | 93.56 | 105.41 | 119.93 | 130.32 | 140.33 |
Oghi | Log-Pearson III | 79.15 | 105.31 | 123.23 | 140.88 | 164.52 | 182.93 | 201.95 |
Daggar | Log-Pearson III | 84.56 | 113.32 | 132.01 | 149.69 | 172.38 | 189.35 | 206.30 |
Mardan | Normal | 77.57 | 100.90 | 113.11 | 123.19 | 134.54 | 142.09 | 149.01 |
Puran | Log-Pearson III | 63.71 | 79.86 | 90.16 | 99.83 | 112.15 | 121.32 | 130.47 |
Besham Qilla | Log-Pearson III | 69.46 | 93.24 | 111.19 | 130.14 | 157.48 | 180.31 | 205.29 |
4 Conclusions
Annual maximum rainfall data based on a 24-hour duration at six rainfall-gauging stations in northern Pakistan were used in this study. The purpose of the study was to find the best-fit probability distributions at northern rainfall gauging stations. The maximum values of expected rainfall or rainfall estimates calculated using a probability distribution that does not provide the best-fit may yield values that are higher or lower than the actual values. These calculations may be used to influence decisions relating to local economics and hydrologic safety systems.
Based on the scores of the goodness of fit tests used in this study, the normal distribution was the best-fit probability distribution at the Mardan rainfall gauging station, and the log-Pearson type-III distribution was the best-fit probability distribution at the rest of the rainfall gauging stations. The design rainfall values, or rainfall estimates, calculated using the best-fit probability distributions at the rainfall gauging stations might be used by design engineers to safely and feasibly design hydrologic projects.
Acknowledgements
The project was financially supported by King Saud University, Vice Deanship of Research Chairs.
Conflict of Interest: The authors declare no conflict of interest.
References
[1] Subudhi R., Probability analysis for prediction of annual maximum daily rainfall of Chakapada block of Kandhamal district in Orissa, Indian J. Soil Conser, 2007, 35, 84-85.
[2] Bhakar S. R., Iqbal M., Devanda M., Chhajed N., Bansal A. K., Probability analysis of rainfall at Kota, Indian J. Agri. Res, 2008, 42, 201-206.
[3] Tao D. Q., Nguyen V. T., Bourque A., On selection of probability distributions for representing extreme precipitations in Southern Quebec, Annual Conference of the Canadian Society for Civil Engineering, 5th-8th June 2002, 1-8. DOI: 10.1061/40644(2002)250
[4] Upadhaya A., Singh S. R., Estimation of consecutive day’s maximum rainfall by various methods and their comparison, Indian J. Soil Conserv., 1998, 26, 193-201.
[5] Bhakar S. R., Bansal A. N., Chhajed N., Purohit R. C., Frequency analysis of consecutive days maximum rainfall at Banswara, Rajasthan, India, ARPN J. Engg. Appl. Sci, 2006, 1, 64-67.
[6] Barkotulla M. A. B., Rahman M. S., Rahman M. M., Characterization and frequency analysis of consecutive days maximum rainfall at Boalia, Rajshahi and Bangladesh, J. Develop. Agri. Econ., 2009, 1, 121-126.
[7] Nemichandrappa M., Ballakrishnan P., Senthilvel S., Probability and confidence limit analysis of rainfall in Raichur region, Karnataka J. Agri. Sci., 2010, 23, 737-741.
[8] Manikandan M., Thiyagarajan G., Vijayakumar G., Probability analysis for estimating annual One day maximum rainfall in Tamil Nadu Agricultural University, Mad. Agri. J., 2011, 98 (1-3), 69-73.
[9] Vivekanandan N., Intercomparison of extreme value distributions for estimation of ADMR, Int. J. Appl. Engg. Technol., 2012, 2 (1), 30-37.
[10] Kazi S. A., Khan M. L., Variability of rainfall and its bearing on agriculture in the arid and semi-arid zones of West Pakistan, Pak. Geographic Rev., 1951, 6 (1), 40-63.
[11] FAO, Pakistan’s experience in rangeland rehabilitation and improvement, Food and Agriculture Organization of the UNO, 70, 1987.
[12] Khan J. A., The climate of Pakistan, Rehber Publishers, Karachi, Pakistan, 1993.
[13] Khan F. K., Pakistan geography, economy and people, Oxford University Press, Karachi, Pakistan, 2002.
[14] Kureshy K. U., Geography of Pakistan, National Book Service Lahore, Pakistan, 1998.
[15] Luo Q., Lin E., Agricultural vulnerability and adaptation in developing countries: the Asia-Pacific region, Climate Change, 1999, 43, 729-743. DOI: 10.1023/A:1005501517713
[16] Chowdhury J. U., Stedinger J. R., Goodness of fit tests for regional generalized extreme value flood distributions, Water Res., 1991, 27 (7), 1765-1777. DOI: 10.1029/91WR00077
[17] Adegboye O. S., Ipinyomi R. A., Statistical tables for class work and Examination, Tertiary publications Nigeria Limited, Ilorin, Nigeria, 1995, 5-11.
[18] Murray R. S., Larry J. S., Theory and problems of statistics, 3rd Edition, Tata McGraw-Hill Publishing Company Limited, New Delhi, India, 2000, 314-316.
© 2016 M. T. Amin et al.
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.