1 Introduction

The South African financial environment is becoming progressively more competitive as pointed out by Van der Colff (2017). This is owing to an increase in globalization as well as a constant change in economic health. Due to the competitive nature of the environment, the time between an economic trough and peak has decreased over time. An economic trough is of particular interest to companies due to the potentially dire situation they might find themselves in. If companies are unable to meet their financial obligations, it increases the probability that they might default. Thus, developing a model to estimate the default intensity or default risk has become increasingly important. This is especially relevant given the impact that COVID-19 has had on the global economy, which is comparable to that of the Global Financial Crisis (GFC) (see for example Gunay (2020) for further elaboration).

The South African context provides a unique environment in which to estimate default intensity as it is a developing country which exhibits dynamic economic behaviour. The deputy governor of South Africa’s Reserve Bank, Dr Rashad (see Cassim, 2021), explained the economic status before COVID-19 as a ”technical recession combined with mounting fiscal stress, growing public sector debt, and a downgrade to below investment grade status in March 2020”. In light of the aforementioned factors, as well as the widespread effect that the COVID-19 pandemic has had on the overall economy, South African treasury bonds have an increasing probability of defaulting. For this reason, estimating default intensity, specifically in the South African context, is of paramount importance. Even though South Africa is much riskier than investment-grade countries, its bonds are the most desirable out of the bonds offered by other rated below investment grade countries. Therefore, investors looking for high yields will surely look to South African bonds first. This indicates that despite the pandemic, the South African economy is among the top performers in the Emerging Markets (EM) according to empirical evidence documented by Chi (2021).

In addition to the estimation of default intensity in South African financial market, this work also compares two interest rate models, namely, the Cox–Ingersoll–Ross model (CIR) originally proposed by Cox et al. (1985) and the \(\alpha\)-CIR model by Jiao et al. (2017) which both fall under the affine class of models (Duffie et al., 2003). The CIR model is more popular and has been commonly used in recent history due to its restriction of interest rates to the positive domain. \(\alpha\)-CIR is an extension of Cox et al. (1985) to capture jumps and low interest rates. Until recently, interest rates have been considered to be exclusively positive, not considering outliers (Cox et al., 1981). Furthermore, both the CIR and the \(\alpha\)-CIR models have a mean-reverting property which further captures the movements of observed interest rates in the financial market.

Hull (2015) details the alternative equilibrium interest rate models. These include the Rendelman-Bartter model as well as the Vasicek model (see Rendleman & Bartter, 1980; Vasicek, 1977). These models possess undesirable properties in modeling interest rates such as the modelling of interest rates similarly to equity prices as well as allowing for negative interest rates. The CIR model provides an improvement on these models.

It has been thought that the CIR’s constraint to model exclusively positive interest rates was an advantage, however, recently, and especially in the COVID-19 crisis, it is not always the ideal choice to model negative interest rates. Brown and Dybvig (1986) as well as Gibbons and Ramaswamy (1993) found additional drawbacks to the use of the CIR. These include the overestimation of short rates and its sensitivity to parameter choices. The overestimation of short rates affects bond pricing. The \(\alpha\)-CIR model, was developed to remedy the drawbacks of the original CIR model. This model is able to model continuously low interest rates as well as sizeable jumps (Jiao et al., 2017). It is especially relevant to model low interest rates during the period of the COVID-19 pandemic. Thus, this paper compares performance of the CIR model to that of the \(\alpha\)-CIR model during the periods before COVID-19 and the period during the pandemic for robustness check.

In this paper and following from Cox et al. (1981), a three factor model is adopted. In allowing for the extension from a one factor to a three factor model, the model accuracy and flexibility is improved. This is owing to the three factor model incorporating variation in the short and long term interest rates.

The extended Kalman filter is employed. Chen and Scott (2003) describe that the innovations for the Kalman filter are linear. The Kalman filter would thus only provide unbiased estimates to the parameters of the unobservable factors of the bond yields if the components of the risk free rate (state variables) are normally distributed. The innovations for the state variables relating to the CIR and \(\alpha\)-CIR are not normally distributed, and thus an extension to the Kalman filter is required.

To solve the estimation problem efficiently two steps are followed. Firstly, a way of estimating the unobservable factors affecting bond yields need to be developed. Secondly, a way of estimating the stochastic default intensity need to be developed. The solution to these two problems forms the two stages of the estimation procedure. The latter problem forms the basis for the ultimate purpose of this paper.

The literature concerning the topic of estimating the probability of default of corporate bonds dates back to the 1970s when Merton (1974) was the first to develop a theory for pricing bonds where there are significant probabilities of default. He called this theory, ”the risk structure of interest rates” and makes use of Black-Scholes methodology when pricing these bonds. The theory developed by Merton links the probability of a firm defaulting to the volatility in its asset value. This approach was found to be impractical according to Duffee (1999). For this approach to be feasible in practice, the boundary conditions must include the conditions under which default occurs and the division of the claimant of the firm, given that a default had occurred. However, the conditions that are supposed to be incorporated into the boundary conditions are often too complex to quantify. For this reason, other methodologies had to be developed.

Our paper is related to the literature on default risk estimation such as Duffie and Singleton (1999), Litterman and Iben (1991), Jarrow et al. (2001), Lando (1998), Duffie (1998), Madan and Unal (1998) as well as Jarrow and Turnbull (1995), who estimated the default intensity parameter by modeling it as a stochastic variable through time in a reduced-form approach. Duffie and Singleton (1999) called this methodology a reduced-form model since the probability of default is linked to the asset value although not directly. The models developed were proven to be mathematically tractable. But it was only in a paper by Duffee (1999) that they were shown in a general way to have the ability to successfully model the behaviour of individual corporate bond prices.

The model framework adopted by Duffee (1999) was similar to those adopted by Madan and Unal (1998) and made a few assumptions. Firstly, the yields were chosen to be modeled according to the affine term structure. Secondly, the affine term structure of the yields is linked to the square root diffusion process (CIR model to be exact) that was chosen to model the interest rates and the default intensities. Lastly, the extended Kalman filter was chosen to estimate the parameter values.

The choice of the affine term structure for the yield corresponds to the choice of the CIR model for the interest rates and the default intensity. The theory used to justify Duffee’s choice of this model was taken from a paper by Cox et al. (1985). The theory that can be used to justify Duffee’s choice of the extended Kalman filter as the method for parameter estimation can be found in a paper by Chen and Scott (2003) where they explore the effectiveness of the extended Kalman filter to estimate the parameters of a multi-factor CIR model for interest rates.

1.1 Our Approach

We model default risk with affine representations of state variables from corporate bonds for major banks in South Africa following the approach by Duffee (1999) and Driessen (2004). More recent literature was found that criticises the CIR model for certain shortcomings in the way that it does not capture low-interest rates and jump behaviours (see for instance Mendoza-Arriaga and Linetsky (2014) for further motivation of applying jump processes in default risk estimation). An extension to the original CIR model (so-called the \(\alpha\)-CIR model) was proposed by Jiao et al. (2017) to rectify these shortcomings. In this paper, a multi-factor \(\alpha\)-CIR is implemented for econometric estimation and compared to the original CIR model. We offer some modest contributions to these stochastic default risk estimation methodologies originally proposed by Duffee (1999) by implementing the model using corporate bond data from four major South African Banks in Rand currency. The bond data used comes from First Rand, Nedbank, Standard Bank and African Development Bank. In addition to the unique data, we are the first to implement empirical studies to show that the \(\alpha\)-CIR model consistently outperforms the multi-factor CIR. We exploited the extended Kalman filter to estimate and accommodate ”hidden” state variables for default risk in the South African financial market. Empirical results show that default risk in South Africa increased sharply during COVID-19, and this points toward the application of advanced stochastic models to manage default risk.

1.2 Outline of Paper

The paper is organised as follows. Section 2 introduces the stochastic models that will be applied for the purpose of stochastic default intensity estimation. Section 3 describe the data and discusses the estimation procedure of the mathematical models and algorithms to be implemented. This section ends with a discussion of the numerical results. Section 4 concludes.

2 Bond Pricing Models

2.1 Model Variables

Consider an economy indexed by the fixed time horizon \(T^*\in [0,\infty )\). In this economy, there is a d-dimensional vector of state variables Y representing the systematic risk factors. We fix a probability space \((\Omega ,{\mathcal {F}},{\mathbb {P}})\) equipped with an information filtration \(\{{\mathcal {F}}_t: t\ge 0\}\), that specifies for each time t the set \({\mathcal {F}}_t\) of events that are observable at that time. Here, the set \(\Omega\) contains the possible states of the world and \({\mathcal {F}}\) consists of the subsets of \(\Omega\), called “events,” to which a probability can be assigned. The probability measure \({\mathbb {P}}: {\mathcal {F}}\rightarrow {\mathbb {R}}\) assigns a probability \({\mathbb {P}}(A)\) to each event A (see for instance Shreve, 2008).

Given a stopping time \(\tau\), we say that a non-negative hazard rate process \(h_t\) is the intensity of \(\tau\) if

$$\begin{aligned} M_t={\mathbb {I}}_{\{\tau \le t\}}-\int _0^th_s{\mathbb {I}}_{\{\tau > s\}}ds \end{aligned}$$

is a martingale.

A stopping time \(\tau\) with intensity h(t) is doubly-stochastic, driven by Y, if \(\tau\) has Poisson property.

Applying the law of iterated expectations we get

$$\begin{aligned} {\mathbb {P}}(\tau >t|X)=e^{-\int _0^th(s)ds}={\mathbb {E}}\left[ e^{-\int _0^th(s)ds}\right] ,\quad t\le T^*, \end{aligned}$$
(2.1)

where \({\mathbb {E}}(\cdot )\) denotes the expectation under \({\mathbb {P}}\) conditional on \({\mathcal {F}}_t\).

The default-free bond price is given byFootnote 1

$$\begin{aligned} P(t,T)={\mathbb {E}}_t^{\mathbb {Q}}\left[ \exp \left( -\int _t^T r(s)ds\right) \right] , \end{aligned}$$
(2.2)

where \(r_t\) is the (risk-less) instantaneous spot rate.

The reduced-form pricing of defaultable bond price \(P^d\) with zero recovery can be expressed as

$$\begin{aligned} P^d(t,T)={\mathbb {I}}_{\{\tau >t\}}{\mathbb {E}}_t^{\mathbb {Q}}\left[ \exp \left( -\int _t^T (r(s)+{\tilde{h}}(s))ds\right) \big |{\mathcal {G}}_t\right] , \end{aligned}$$
(2.3)

where \({\tilde{h}}(t)\) is the default intensityFootnote 2 under the risk-neutral measure \({\mathbb {Q}}\) (see Schönbucher, 1998). Here \({\mathcal {G}}_t={\mathcal {F}}_t\vee {\mathcal {H}}_t\), with \(({\mathcal {H}}_t)_{0\le t\le T}\) denoting the filtration generated by the default indicator process \({\mathbb {I}}_{\tau \le t}.\)

Our goal is to estimate defaultable bond in Eq. (2.3). We consider the risk-neutral default intensity \({\tilde{h}}_t\) as an additional factor whose dynamics follow

$$\begin{aligned} {\tilde{h}}(s) = \rho (r(s)- {\bar{r}}) + {\hat{h}}(s). \end{aligned}$$
(2.4)

This setting inspired by Duffee (1999) is used to capture the correlation between the default-free factor r and the default intensity \({\tilde{h}}\). \(\rho\) is treated as a correlation parameter and \({\bar{r}}\) is the sample mean of r.

2.2 Stochastic Modelling

2.2.1 CIR Model

We assume both \(r_t\) and \({\hat{h}}_t\) are given as a sum of three independent stochastic factors \(y_{i}\) for \(i=1,2,3\), i.e.

$$\begin{aligned} r(t)&=\sum _i^3 y_{i}^r(t) \end{aligned}$$
(2.5)
$$\begin{aligned} {\hat{h}}(t)&=\sum _i^3 y_{i}^{{\hat{h}}}(t), \end{aligned}$$
(2.6)

where each \(y_i^r\) and \(y_i^{{\hat{h}}}\) follow the Cox–Ingersoll–Ross (CIR)Footnote 3 dynamics under the pricing measure, i.e.Footnote 4

$$\begin{aligned} dy_i^r(t)&=\kappa _i^r(\theta _i^r-r_i(t))dt+\sigma _i^r\sqrt{y_i(t)}dW_i^{{\mathbb {P}} r}(t) \end{aligned}$$
(2.7)
$$\begin{aligned} dy_i^{{\hat{h}}} (t)&=\kappa _i^h(\theta _i^h-{\hat{h}}(t))dt+\sigma _i^h\sqrt{{\hat{h}}(t)}dW_i^{{\mathbb {P}}{\hat{h}}}(t), \end{aligned}$$
(2.8)

where \(dW_i^{{\mathbb {P}} r}(t)\quad (i=1,2,3)\) and \(dW_i^{{\mathbb {P}} {\hat{h}}}(t)\quad (i=1,2,3)\) are independent Wiener processes.Footnote 5 The CIR-type model is chosen for its analytical tractability, as well as the guaranteed positivity of the modelled object, with the condition which ensures that the origin is inaccessible.

The market price of risk (MPRFootnote 6) is modelled with complete affine specifications (see Doran and Ronn (2008), Jarrow et al. (2001) and Dai and Singleton (2002)) and more specifically as

$$\begin{aligned} dW_i^{{\mathbb {P}} r}(t)&=dW_i^{{\mathbb {Q}} r}(t)-\frac{\lambda _i^r}{\sigma _i^r}\sqrt{y_i^r(t)}dt, \end{aligned}$$
(2.9)
$$\begin{aligned} dW_i^{{\mathbb {P}}{\hat{h}}}(t)&=dW_i^{{\mathbb {Q}}{\hat{h}}}(t)-\frac{\lambda _i^h}{\sigma _i^h}\sqrt{y_i^{{\hat{h}}}(t)}dt, \end{aligned}$$
(2.10)

where \(dW_i^{{\mathbb {Q}} r}(t)\) and \(dW_i^{{\mathbb {Q}}{\hat{h}}}(t)\) are Wiener processes under the risk-neutral measure \({\mathbb {Q}}.\), i.e., Under \({\mathbb {Q}}\), Eqs. (2.7)–(2.8) becomes

$$\begin{aligned} dy_i^r(t)&=(\kappa _i^r\theta _i^r-(\kappa _i^r+\lambda _i^r)r_i(t))dt+\sigma _i^r\sqrt{y_i(t)}dW_i^{{\mathbb {Q}} r}(t) \end{aligned}$$
(2.11)
$$\begin{aligned} dy_i^{{\hat{h}}} (t)&=(\kappa _i^h\theta _i^h-(\kappa _i^h+\lambda _i^h){\hat{h}}(t))dt+\sigma _i^h\sqrt{{\hat{h}}(t)}dW_i^{{\mathbb {Q}}{\hat{h}}}(t). \end{aligned}$$
(2.12)

The coefficients \(\kappa _i^r, \kappa _i^h, \theta _i^r, \rho _i, \theta _i^h, \sigma _i^r\), \(\sigma _i^h\), \(\lambda _i^r\), and \(\lambda _i^h\) are positive constants in order to keep the model analytically tractable.

The following proposition helps us to price bonds in an affine interest rate term structure framework.

Proposition 2.1

Let y be a d-dimensional Cox–Ingersoll–Ross (CIR) process, i.e.

$$\begin{aligned} dy_{i}(t) = \kappa _{i}(\theta _{i} -y_{i}(t)) dt + \sigma _{i}\sqrt{y_{i}(t)} dW_{i}(t), \end{aligned}$$
(2.13)

where \(W_{i}\) (\(i = 1,.., d\)) are independent Wiener processes under the probability measure \({\mathbb {P}}\). If for all \(i=1,..,d\)

$$\begin{aligned} \mu&\ge -\frac{\kappa _i^2}{2\sigma _i^2}, \end{aligned}$$
(2.14)

hence for all \(t\ge 0\) and \(i=1,..,d\)

$$\begin{aligned} {\mathbb {E}}^{\mathbb {Q}}\left[ e^{-\mu \int _0^t y_i(s)ds}\right] =&e^{\Phi _i(0,t,\mu )+y_i(0)\Psi _i(0,t,\mu )}, \end{aligned}$$
(2.15)

where

$$\begin{aligned} \Phi _i(0,t,\mu )=&-\frac{2\theta _i\kappa _i}{\sigma _i^2} \ln \left( \frac{\kappa _i(e^{\sqrt{A_i}t}-1)+\sqrt{A_i}(e^{\sqrt{A_i}t}+1)}{2\sqrt{A_i}e^{\frac{\sqrt{A_i}+\kappa _i}{2}t}}\right) , \end{aligned}$$
(2.16)
$$\begin{aligned} \Psi _i(0,t,\mu )=&\frac{-2\mu (e^{\sqrt{A_i}t}-1)}{\kappa _i(e^{\sqrt{A_i}t}-1)+\sqrt{A_i}(e^{\sqrt{A_i}t}+1)}, \end{aligned}$$
(2.17)

with \(A_i=\kappa _i^2+2\mu \sigma _i^2\).

Proof

See Lamberton and Lapeyre (1996). \(\square\)

Next we apply Proposition 2.1 to compute bond prices in Eqs. (2.2) and (2.3):

$$\begin{aligned} P(t,T)&={\mathbb {E}}_t^{\mathbb {Q}}\left[ \exp \left( -\int _t^T r(s)ds\right) \right] ={\mathbb {E}}_t^{\mathbb {Q}}\left[ \exp \left( -\int _t^T \sum _i^3 y_{i}^r(s)ds\right) \right] \end{aligned}$$
(2.18)
$$\begin{aligned}&=\prod _{i=1}^{3}{\mathbb {E}}_t^{\mathbb {Q}}\left[ e^{-\int _t^Ty_i^r(s)ds}\right] =e^{\sum _{i=1}^{3}(\Phi _i^r(t,T,1)+y_i^r(t)\Psi _i^r(t,T,1))}. \end{aligned}$$
(2.19)

The defaultable bond price is given by

$$\begin{aligned} P^d(t,T)&={\mathbb {E}}_t^{\mathbb {Q}}\left[ \exp \left( -\int _t^T (r(s)+h(s))ds\right) \right] \end{aligned}$$
(2.20)
$$\begin{aligned}&={\mathbb {E}}^{\mathbb {Q}}\left[ \exp \left( -\int _t^{T} (r_s+\rho (r_s-{\bar{r}})+{\hat{h}}_s)ds\right) \right] \end{aligned}$$
(2.21)
$$\begin{aligned}&=e^{\rho {\bar{r}}(T-t)}\times {\mathbb {E}}^{\mathbb {Q}}\left[ \exp \left( -(1+\rho )\int _t^{T} r_sds\right) \right] {\mathbb {E}}^{\mathbb {Q}}\left[ \exp \left( -\int _t^{T}{\hat{h}}_s ds\right) \right] \end{aligned}$$
(2.22)
$$\begin{aligned}&=\exp \left\{ \sum _{i=1}^{3}(\rho _i{\bar{r}}_i(T-t)+\Phi _i^r(t,T,(1+\rho _i))+y_i^r(t)\Psi _i^r(t,T,(1+\rho _i)))\right\} \end{aligned}$$
(2.23)
$$\begin{aligned}&\quad \times \exp \left\{ \sum _{i=1}^{3}(\Phi _i^h(t,T,1)+y_i^h(t)\Psi _i^h(t,T,1))\right\} \end{aligned}$$
(2.24)

Under the no-arbitrage principle, the risk-neutral dynamics (2.12) support the affine yield structure

$$\begin{aligned} Y(t,T)=-\frac{\ln P(t,T)}{T-t}=G_{0}^r(T-t)+{\textbf{G}}_{1}^r(T-t)^\top \mathbf {y^r}(t), \end{aligned}$$
(2.25)

with

$$\begin{aligned} G_{0}^r(\tau )&=-\frac{1}{\tau }\sum _{i=1}^{3}\Phi _i^r(0,\tau ,1,0) \end{aligned}$$
(2.26)
$$\begin{aligned} {\textbf{G}}_{1}^r(\tau )&=\begin{pmatrix} -\frac{1}{\tau }\Psi _1^r(0,\tau ,1,0)&{} -\frac{1}{\tau }\Psi _2^r(0,\tau ,1,0)&{} -\frac{1}{\tau }\Psi _3^r(0,\tau ,1,0)\\ \end{pmatrix} \end{aligned}$$
(2.27)
$$\begin{aligned} {\textbf{y}}^r(t)&=\begin{pmatrix} y_1^r(t)&{} y_2^r(t)&{} y_3^r(t)\\ \end{pmatrix}. \end{aligned}$$
(2.28)

It should be emphasised here that the parameters for \(\Phi _i^r\) are \((\kappa _i^r,\theta _i^r,\sigma _i^r,\lambda _i^r)\) while for \(\Psi _i^h\) are indexed by h, i.e., \((\kappa _i^h,\theta _i^h,\sigma _i^h,\lambda _i^h)\).

2.2.2 \(\alpha\)-CIR Model

The \(\alpha\)-CIR model is an extension of the above-mentioned CIR model. Duffee (1999) models the state variables by using the CIR process but since then, the \(\alpha\)-CIR model has been found to make up for the drawbacks of the original CIR model.

Jiao et al. (2017) explains that this model enables continuous low-interest rates (such as were observed in the South African financial market, during the COVID-19 pandemic) as well as sizeable jumps. Typically, the low-interest rates and large jumps do not occur simultaneously. In order to allow for these previously-considered rare occurrences in interest rates, the \(\alpha\)-CIR model is based on a regime-switching framework. The jump behaviour as well as the fatter tails are characterised by an added parameter, \(\alpha\), which is incorporated by adding an \(\alpha\)-stable Lévy process beside the geometric Brownian motion in the original CIR representation.

The \(\alpha\)-CIR process \(y_t\) satisfies the following stochastic differential equation (SDE),

$$\begin{aligned} dy(t)=\kappa (\theta -y(t))dt+\sigma \sqrt{y(t)}dB(t) +\sigma _Z\root \alpha \of {y({t^{-}})}dZ(t), \end{aligned}$$
(2.29)

where \(\theta , \sigma , \sigma _Z\) are non-negative and \(\kappa \in {\mathbb {R}}\). The process \(B=(B(t),t>0)\) is a Brownian motion and \(Z=(Z(t), t>0)\) is a spectrally positive \(\alpha\)-stable compensate Lévy process with parameter \(\alpha \in (1, 2]\) whose Laplace transform is given is given by

$$\begin{aligned} {\mathbb {E}}[e^{-qZ(t)}]=\exp \left\{ -\frac{t q^\alpha }{\cos (\pi \alpha /2)}\right\} ,\quad q\ge 0. \end{aligned}$$

The process W and Z are assumed to be independent. The departure of the process Z from Brownian motion is controlled by the tail index \(\alpha\). When \(\alpha <2\), Z is a pure jump process with heavy tails.

As in Sect. 2.2.1, we also assume the instantaneous spot rate r and the default intensity h are driven by a sum of three independent state variables evolving according to the stochastic differential equation (2.29).

We have assumed that model dynamics (2.29) are specified under a risk-neutral probability \({\mathbb {Q}}\). However, it is important to establish a link with the physical measure denoted by \({\mathbb {P}}\). The following proposition enables us to change measures from \({\mathbb {Q}}\) to \({\mathbb {P}}\) for \(\alpha\)-CIR dynamics (see also Jiao et al., 2021).

Proposition 2.2

Fix \(T^{\star }\) large enough. Let y be an \(\alpha\)-CIR \((\kappa ,\theta ,\sigma ,\sigma _{Z},\alpha )\) process defined in (2.29) under the risk-neutral measure \({\mathbb {Q}}\) and assume that the filtration \(({\mathcal {F}}_t)_{t>0}\) is generated by the random fields W and \({\widetilde{N}}\). Fix \(\lambda _1 \in {\mathbb {R}}\) and \(\lambda _2\in {\mathbb {R}}_{+}\), and define for \(t\in [0, T^*]\)

$$\begin{aligned} U_t=\lambda _1\int _0^t\int _0^{y_s}W(ds,du)+\int _0^t\int _0^{y_{s^-}}\int _0^\infty (e^{-\lambda _2\zeta }-1){\widetilde{N}}(ds,du,d\zeta ). \end{aligned}$$
(2.30)

Then the Doléans-Dade exponential \({\mathcal {E}} (U)\) is a martingale on \([0, T^{\star }]\) and the probability measure \({\mathbb {Q}}\) defined by \(\frac{d{\mathbb {Q}}}{d{\mathbb {P}}} \biggm |_{{\mathcal {F}}_{T^{\star }}} = {\mathcal {E}}(U)_{T^{\star }}\) is equivalent to \({\mathbb {P}}\). Moreover, r is under \({\mathbb {P}}\) an \(\alpha\)-CIR process as with the parameters \((\kappa ^{\mathbb {P}},\theta ^{\mathbb {P}},\sigma ^{\mathbb {P}},\sigma _{Z}^{\mathbb {P}})\), where

$$\begin{aligned} \kappa ^{\mathbb {P}}=\kappa -\sigma \lambda _1-\frac{\alpha \sigma _{Z}}{\cos (\pi \alpha /2)} \lambda _2^{\alpha -1}, \qquad \theta ^{\mathbb {P}}=\kappa \theta /\kappa ^{\mathbb {P}}, \qquad \sigma ^{\mathbb {P}}=\sigma ,\qquad \sigma _{Z}^{\mathbb {P}}=\sigma _{Z}. \end{aligned}$$

Proof

See Jiao et al. (2017, 2021). \(\square\)

In the following, we give a proposition of a useful transform that will be necessary for the computation of bond prices under \(\alpha\)-CIR model.

Proposition 2.3

Let y be a \(\alpha\)-CIR process given in Eq. (2.29) with \(y(0)=y_0\). For \(\mu >0\), we have

$$\begin{aligned} {\mathbb {E}}^{\mathbb {Q}}\left[ e^{-\mu \int _0^t y(s) ds}\right] =\exp \left( \phi (t,\mu )+y_0\psi (t,\mu )\right) , \end{aligned}$$
(2.31)

where \(\phi\) and \(\psi\) are the solutions to the generalized Ricatti equations

$$\begin{aligned} \frac{\partial \phi (t,\mu )}{\partial t}&=F(\psi (t,\mu ),\mu ), \quad \phi (0,\mu )=0; \end{aligned}$$
(2.32)
$$\begin{aligned} \frac{\partial \phi (t,\mu )}{\partial t}&=R(\psi (t,\mu ),\mu ),\quad \psi (0,\mu )=0. \end{aligned}$$
(2.33)

Moreover, the functions F and \(R: i{\mathbb {R}} \times {\mathbb {C}}_-^2\rightarrow {\mathbb {R}}\) are defined by

$$\begin{aligned} F(u,\mu )&=\kappa \theta u, \end{aligned}$$
(2.34)
$$\begin{aligned} R(u,\mu )&=-\mu -\kappa u+\frac{\sigma ^2}{2}u^2-\frac{\sigma _Z^\alpha }{\cos (\pi \alpha /2)}(-u)^\alpha . \end{aligned}$$
(2.35)

Proof

See Jiao et al. (2017). \(\square\)

As in Sect. 2.2.1, we assume both \(r_t\) and \({\hat{h}}_t\) are given as a sum of three independent stochastic factors \(y_{i}\) for \(i=1,2,3\), i.e.

$$\begin{aligned} r(t)&=\sum _i^3 y_{i}^r(t) \end{aligned}$$
(2.36)
$$\begin{aligned} {\hat{h}}(t)&=\sum _i^3 y_{i}^{{\hat{h}}}(t), \end{aligned}$$
(2.37)

where the \(y_i^r\) and \(y_i^{{\hat{h}}}\) follow the \(\alpha\)-Cox–Ingersoll–Ross (CIR) dynamics under the pricing measure,

$$\begin{aligned} dy_i^r(t)&=\kappa _i^r(\theta _i^r-y_i^r(t))dt+\sigma _i^r\sqrt{y(t)}dB_i^r(t) +\sigma _{Zi}^r\root \alpha _i^r \of {y_i^r({t^{-}})}dZ_i^r(t), \end{aligned}$$
(2.38)
$$\begin{aligned} dy_i^{{\hat{h}}}(t)&=\kappa _i^{{\hat{h}}}(\theta _i^{{\hat{h}}}-y_i^{{\hat{h}}}(t))dt+\sigma _i^{{\hat{h}}}\sqrt{y(t)}dB_i^{{\hat{h}}}(t) +\sigma _{Zi}^{{\hat{h}}}\root \alpha _i^{{\hat{h}}} \of {y_i^{{\hat{h}}}({t^{-}})}dZ_i^{{\hat{h}}}(t), \end{aligned}$$
(2.39)

where \(dB_i^r(t), dZ_i^r(t), dZ_i^{{\hat{h}}}(t)\quad (i=1,2,3)\) and \(dB_i^{{\hat{h}}}(t)\quad (i=1,2,3)\) are independent Wiener processes.

Example if r(t) is given as a sum of the factor \(\alpha\)-CIR model in Eq. (2.29), with parameters \((\kappa _i^r,\theta _i^r,\sigma _i^r,\sigma _{Zi}^r,\alpha _i^r)\) for \(i=1,2,3\) under the equivalent risk-neutral probability measure \({\mathbb {Q}}\) then the zero-coupon bond price is given by setting \(\mu =1\) in Proposition 2.3:

$$\begin{aligned} \begin{aligned} P(t,T) = \prod _{i=1}^3\exp {\left( \phi _i^r(T-t,1)+y_i(t)^r \psi _i^r(T-t,1)\right) }, \end{aligned} \end{aligned}$$
(2.40)

where \(\phi _i^r\) and \(\psi _i^r\) solve the generalized Riccati equations

$$\begin{aligned} \frac{\partial \phi _i^r(t,p)}{\partial t}&=F_i(\psi _i^r(t,p),p), \quad \phi _i^r(0,p)=0,\quad \forall i;\end{aligned}$$
(2.41)
$$\begin{aligned} \frac{\partial \phi _i^r(t,p)}{\partial t}&=R_i(\psi _i^r(t,p),p),\quad \psi _i^r(0,p)=0,\quad \forall i, \end{aligned}$$
(2.42)

where

$$\begin{aligned} F_i(u,p)&=\kappa _i^r\theta _i^r u \end{aligned}$$
(2.43)
$$\begin{aligned} R_i(u,p)&=-p-\kappa _i^r u+\frac{(\sigma _i^r)^2}{2}u^2-\frac{(\sigma _{Zi}^r)^{\alpha _i^r}}{\cos (\pi \alpha _i^r/2)}u^{\alpha _i^r}. \end{aligned}$$
(2.44)

From the zero-coupon \(\alpha\)-CIR representation of the bond price, the zero-coupon \(\alpha\)-CIR yield can be determined exactly as given in Eq. (2.25).

The defaultable bond price under the \(\alpha\)-CIR model is given by

$$\begin{aligned} P^d(t,T)&={\mathbb {E}}_t^{\mathbb {Q}}\left[ \exp \left( -\int _t^T (r(s)+h(s))ds\right) \right] \end{aligned}$$
(2.45)
$$\begin{aligned}&=e^{\rho {\bar{r}}(T-t)}\times {\mathbb {E}}^{\mathbb {Q}}\left[ \exp \left( -(1+\rho )\int _t^{T} r_sds\right) \right] {\mathbb {E}}^{\mathbb {Q}}\left[ \exp \left( -\int _t^{T}{\hat{h}}_s ds\right) \right] \end{aligned}$$
(2.46)
$$\begin{aligned}&=\exp \left\{ \sum _{i=1}^{3}\left( \rho _i{\bar{r}}_i(T-t)+\phi _i^r(T-t,1+\rho _i)+y_i(t)^r \psi _i^r(T-t,1+\rho _i) \right) \right\} \end{aligned}$$
(2.47)
$$\begin{aligned}&\quad \times \exp \left\{ \sum _{i=1}^{3}\phi _i^{{\hat{h}}}(T-t,1)+y_i(t)^{{\hat{h}}} \psi _i^{{\hat{h}}}(T-t,1)\right\} , \end{aligned}$$
(2.48)

where \(\phi _i^r (s,p)\), and \(\psi _i^r (s,p)\) are solutions to Eqs. (2.41)–(2.44).Footnote 7\(\phi _i^{{\hat{h}}} (s,p)\) and \(\psi _i^{{\hat{h}}} (s,p)\) are the unique solutions to

$$\begin{aligned} \frac{\partial \phi _i^{{\hat{h}}}(t,p)}{\partial t}&=F_i^{{\hat{h}}}(\psi _i^{{\hat{h}}}(t,p),p), \quad \phi _i^{{\hat{h}}}(0,p)=0,\quad \forall i; \end{aligned}$$
(2.49)
$$\begin{aligned} \frac{\partial \phi _i^{{\hat{h}}}(t,p)}{\partial t}&=R_i^{{\hat{h}}}(\psi _i^{{\hat{h}}}(t,p),p),\quad \psi _i^{{\hat{h}}}(0,p)=0,\quad \forall i, \end{aligned}$$
(2.50)

where

$$\begin{aligned} F_i^{{\hat{h}}}(u,p)&=\kappa _i^{{\hat{h}}}\theta _i^{{\hat{h}}} u \end{aligned}$$
(2.51)
$$\begin{aligned} R_i^{{\hat{h}}}(u,p)&=-p-\kappa _i^{{\hat{h}}} u+\frac{(\sigma _i^{{\hat{h}}})^2}{2}u^2-\frac{(\sigma _{Zi}^{{\hat{h}}})^{\alpha _i^{{\hat{h}}}}}{\cos (\pi \alpha _i^{{\hat{h}}}/2)}u^{\alpha _i^{{\hat{h}}}}. \end{aligned}$$
(2.52)

3 Empirical Research

The South African government bond data as well as the corporate data from four of South Africa’s largest banks was retrieved from Bloomberg’s database.

Two different types of data were used in the implementation namely the treasury bond data (used in the estimation of the risk-free component) and corporate bond data of the four major banks in South Africa (used in the estimation of the default risk component). A detailed explanation follows.

3.1 Data and the Methodology

3.1.1 Government Bond Data

The South African government bond dataFootnote 8 was used as a proxy for risk-free rate for which to estimate the default-free process. Daily bond yields from 22/10/2015 to 10/05/2021 with maturities of 2, 6, 10, and 15 years were used.Footnote 9 As stated in the introduction of this paper, government bonds of developing nations are generally too risky to be used in the estimation of the default-free process. However, in the South African bond context, it is one of the nearest to risk-free proxies that can be achieved. It was also stated that South African bonds trade in the speculative investment category indicating that there is a considerable probability of default. It should thus be noted that the estimate of the three-factor risk-free rate cannot be expected to be accurate. The government bond data used in the estimation of default-free parameters were the bond yields with maturities of 2, 6, 10, and 15 years from 22/10/2015 to 10/05/2021. Table 1 gives the descriptive statistics of the long-end of the South African government bond yields. Our data covers a more recent important event, the Global COVID-19 pandemic. We, therefore, break the data further down into smaller subsamples and analyse their statistical features as depicted in Table 1. There is a regime shift observed at the onset of the pandemic. As seen from Table 1, the short-end of the yield curve drastically declined during the COVID-19 period. Note that the standard deviation of the 6- and 10-year bond yields estimated in the period of COVID-19 0.896% and 0.683% respectively which is almost double that (0.399% and 0.336% resp.) of the corresponding yields before COVID-19. The reason for this seemingly higher volatility is due to the market uncertainty as a result of COVID-19. To better quantify the volatility of the interest rates during those two periods, we also present the linearly as well as nonlinearly detrended standard deviation of the interest rates. The linearity and nonlinearity are removed by subtracting a least-squares polynomial fit of degree 1 and degree 2 respectively. For the two statistics of skewness (asymmetry) and kurtosis (leptokurtic), we can observe that the bond yields used in our study are characterised by non-normal distributions. The positive signs of the skewness coefficients indicate that the bond yields are skewed to the right and are far from being symmetric for all bond yields during COVID-19 period. Also, the kurtosis coefficients confirm that the leptokurtic distributions for all yields used in this study show the existence of a high peak or a fat tails in their volatilities.

Table 1 Descriptive statistics of South African government yields

Owing to the wide-ranging impact that COVID-19 has had on the overall economy, an estimation of the parameters was drawn before the pandemic as well as during (Chi, 2021). The COVID-19 pandemic starting date was taken as the \(5^{th}\) of March 2020, which is considered to be the date that the virus entered South Africa. This allows for just over three weeks of data prior to South Africa’s nationwide lockdown. These dates were selected in order to be able to assess the full impact that the pandemic has had on bond prices and to clearly judge which model reacted best to the impact the pandemic had on the market. The data that were available in the time period before the pandemic is large enough to acquire meaningful results when the model fit it.

To assess the impact of the COVID-19 pandemic we split our data sample into two subperiods. First, the estimation is performed for the period 22/10/2015–05/03/2020 before the global lockdown, 05/03/2020–10/05/2021 during the pandemic, and for the whole period, i.e., from 22/10/2015 to 10/05/2021.

Figure 1, below, shows the South African bond yields for 2-, 6-, 10-, and 15-years to maturity. A regime shift is observable at the start of the COVID-19 pandemic where the South African government bond yields experienced a sudden increase followed by a sharp decline in the interest rates. The spread between long-term and short-term interest rates increased substantially in the wake of the global pandemic. This indicates varying levels of risks in the market, and this is a key predictor of economic recessions as detailed by Estrella and Mishkin (1998).

Fig. 1
figure 1

South African Government Bond Yields

From Fig. 1, it can be seen that the daily yield data is more volatile for longer maturities. Furthermore, the yield data for the 2, 6, 10 and 15 year bonds follow the same pattern. The yields of these bonds have an overall increasing trend. This is due to the increasing yield required by investors as the risk of default increases as the time to maturity decreases. The reason for the increase in yield is further emphasised by the spike in the data for the 2, 6, 10 and 15 year bonds at the start of 2020, which is most likely due to the onset of the COVID-19 pandemic which indicates the uncertainty in the market and a decrease in demand of South African Treasury bonds at that time.

3.1.2 Corporate Bond Data

The corporate data that was used was taken from four of South Africa’s major banks namely African Development Bank (ADB), FirstRand, Nedbank, and Standard Bank. The government bond data was given in terms of zero-coupon yield values, whereas the corporate bond data was given in terms of bond price values. The dataset for the government bond yields is longer than the dataset for individual banks. This is accounted for in the estimation process.

Tables 2 and 3 show corporate bond data summary statistics for the major banks that will be applied in the default risk estimation procedure.

Table 2 Corporate bonds for South African major banks
Table 3 Summary Statistics of Corporate bond yields for South African major banks

3.2 Estimation Method

This section discusses the mathematical methodology that was used in the implementation of the models.

The estimation approach considered is the quasi-maximum likelihood in combination with the extended Kalman filter. The model is cast into a state-space form, which consists of the system equations and the observation equations. The estimation of the defautable bond price in Eq. (2.3) is followed by two steps. The hidden factor is the instantaneous rate in Eqs. (2.7) and (2.38). First we estimate the instantaneous rate \(r_t\) according to default-free bond data. Then taking the first step result as given we estimate the risk-neutral default intensity \(h_t\) based on the observation of corporate bonds. In this case, the hidden factor is the instantaneous intensity in Eqs. (2.8) and (2.39).

3.2.1 Default-Free Model Methodology

The default free model presented by Duffee (1999) uses the assumption of the equivalent martingale measure \({\mathbb {Q}}\) (which may not be unique) where bonds are priced according to the default-free interest rate \(r_t\). Duffee (1999) considers the assumption of a two-factor model.Footnote 10 In this paper a three-factor model is considered, whereby \(r_t\) is equal to the sum of the state variables as given in Eq. (2.36).

There are four maturities, \(T_1=2\) years, \(T_2=6\) years, \(T_3=10\) years, and \(T_4=15\) years. The observed bond yield \(Y(t,T_j)\) of corresponding maturity \(T_j\) is fitted into the formula

$$\begin{aligned} Y(t,T_j)=G_{0}^r(T_j-t)+{\textbf{G}}_{1}^r(T_j-t)^\top \mathbf {y^r}(t)+\epsilon _{jt}, \end{aligned}$$
(3.1)

where \(G_{r0}\) and \(G_{r1}\) are of the form as given in (2.26) and (2.28) and the measurement errors \(\epsilon _{jt}\) are i.i.d. \({\mathcal {N}}(0,\sigma _\epsilon ^2)\)-distributed. The estimation task is to estimate the unobservable instantaneous rate \(r_t\) which is expressed as a sum of factors given in Eq. (2.7) or (2.29) and the parameters in \(G_{r0}\) and \(G_{r1}\) according to the observed yield \(Y(t, T_j)\).

In the filtering case, the parameter set \(\Theta\) includes the observation error variance and the variance of the state variables.

This paper estimates parameters through the method of quasi-maximum likelihood (in conjunction with the extended Kalman filter discussed in the “Appendix A”), i.e., we choose \({\hat{\theta }}\) that maximises the likelihood of all observations \(Y_{t_N}\):

$$\begin{aligned} {\hat{\theta }}=\arg \max _{\theta \in \Theta } L(Y_{t_N},\theta ), \end{aligned}$$
(3.2)

where the likelihood function \(L(Y_{t_N},\theta )\) can be decomposed in terms of sequential conditional likelihood so that

$$\begin{aligned} {\hat{\theta }}=\prod _{k=1}^N l(y_{t_k}|Y_{t_{k-1}},\theta ). \end{aligned}$$
(3.3)

The parameter estimates need to be given in terms of the real-world measure. When switching from the risk-neutral world to the real world, the volatility of the model stays constant while the drift rate does not. To connect pricing under the risk-neutral measure to pricing under the real-world measure, Dai and Singleton (2002) incorporates the market price of risk into the model to account for the change in the drift rate (for Proposition 2.2). The market price of risk (defined in Eqs. (2.9) and (2.10)) is thus another parameter to estimate.

Since the risk-free rate is the sum of state variables, the process for the \(y_i\)’s also follows the CIR model and is generated as a square root diffusion process independently for each variable, as stated in Chen and Scott (2003). It is then clear that the state variables can be written recursively. A model for the yield is then determined, such that it is dependent on the state variables. Following that, the model error of the yield can be determined by calculating the difference between the treasury bond yield data (i.e. the market yield) and the model yield. Chen and Scott (2003) make use of the extended Kalman filter in order to estimate the unobservable state variables and the same methodology is applied in this paper as well. The extended Kalman filter is used in order to tackle the situation that the factors cannot be observed directly. Since the log-likelihood function depends on the error in the model yield, the parameters for the state variables (\(\theta _i^r\), \(\kappa _i^r\), \(\sigma _i^r\), \(\lambda _i^r\)) are found such that the (negative) log-likelihood is a minimum. In this way, the parameters are essentially estimated through a quasi-maximum likelihood (QML) technique since the noise process is conditionally Gaussian.Footnote 11

3.2.2 Defaultable Model Methodology

This section is the main purpose of the paper which is to estimate the default risk or intensity parameter, \({\hat{h}}_t\).Footnote 12 This estimation procedure uses the short rate, \(r_t\), as estimated in the default free estimation procedure discussed in Sect. 3.2.1.

The process followed by the default intensity parameter, \({\hat{h}}_t\) is also modeled by the multi-factor CIR and \(\alpha\)-CIR model given in Eqs. (2.8) and (2.39) respectively.

Observed are corporate bond prices of a major bank with different maturity dates \(T_j\) and coupon \(c_j\) that identifies the \(j{th}\) bond. For example, for the CIR model the measurement equation, based on Eq. (2.3) becomes

$$\begin{aligned} P^d(t,T_j,c_j)\&=\exp \left\{ \sum _{i=1}^{3}(\rho _i{\bar{r}}_i(T_j-t)+\Phi _i^r(t,T_j,(1+\rho _i),0)+y_i^r(t)\Psi _i^r(t,T_j,(1+\rho _i),0))\right\} \end{aligned}$$
(3.4)
$$\begin{aligned}&\quad \times \exp \left\{ \sum _{i=1}^{3}(\Phi _i^h(t,T_j,1,0)+y_i^h(t)\Psi _i^h(t,T_j,1,0))\right\} +\epsilon _{jt}, \end{aligned}$$
(3.5)

\(\epsilon _{jt}\sim {\mathcal {N}}(0,\sigma _\epsilon ^2)\).

For all \(i=1,2,3\) \(y_i^r\) is treated as exogenously given and \(\Phi _i^r\) and \(\Psi _i^r\) are evaluated at \(\kappa _i^r, \theta _i^r, \sigma _i^r\), and \(\lambda _i^r\) already estimated in the first step of default-free estimation. The maximum likelihood method is applied in order to estimate \(y_i^h, \kappa _i^h, \rho _i, \theta _i^h, \sigma _i^h, \lambda _i^h\) and the variance of the measurement errors \(\sigma _\epsilon\). The same procedure applies to the \(\alpha\)-CIR model.

As done in the default-free Sect. 3.2.1 where the state variables were estimated from the recursive pattern, the default intensity parameter in this section is estimated through a recursive pattern. The model bond price is subtracted from the observed market corporate bond to estimate the model error. An extended Kalman filter procedure was implemented to specify the method used in estimating the default intensity parameters that minimize the model errors. From the estimated parameters, the default intensity was thus calculated.

3.3 Empirical Results

This section begins with the discussion of the default-free process estimation. We proceed to discuss the estimation of default risk from defaultable corporate bonds. Lastly, a comparison is then drawn between the two models, making specific inference to performance of models before COVID-19 to that during the pandemic.

3.3.1 Default-Free Parameter Estimation Result

The implementation of the default-free estimation was done to find the optimal parameter values that result in the minimum of the negative log-likelihood with the extended Kalman filtering approach (see Chen and Scott (2003) and Chen and Scott (1993) for more details on Extended Kalman fitering method or see “Appendix A” for a brief summary of Kalman filtering). A global optimiser, Adaptive Simulated Annealing (ASA) introduced by Ingber (1993, 1996) is used for the optimisation task.Footnote 13 Default-free parameter estimates are given in Table 4. We measure the performance based on the log-likelihood value, the bigger the absolute value of the log-likelihood function, the better the model fits the data. For example, the minimum of the negative log-likelihood value for the default-free for 3-factor CIR model is \(-7.47E+04\) and \(-8.48E+03\) before and during COVID-19 respectively while the negative log-likelihood value for the default-free for 3-factor \(\alpha\)-CIR model is \(-7.99E+04\) and \(-8.87E+03\) before and during COVID-19 respectively. This indicates that the \(\alpha\)-CIR model provides a better fit to the data. We can also see that both models perform poorly during the COVID-19 period. The estimates of the parameters are given in Table 4 below, rounded to the nearest 3 decimals. The values in parentheses are the standard errors of the estimates.Footnote 14 Furthermore, looking at the small values for the standard errors, we are confident about the accuracy of the default-free parameter estimates for both models.

Table 4 Parameter estimates of treasury bond yields

In Table 5, we show the model fit across yield maturities. The \(\alpha\)-CIR model consistently outperforms the CIR model across all maturities and has a better RMSE notably during the COVID-19 period.Footnote 15 Figure 2 shows the state variables from both models that were estimated using the optimal parameters given in Table 4. The state variables were estimated for the period of October 2015–May 2021. The estimated unobservable or state variables mimic the trajectories of the yield curve very well before and during COVID-19.

Table 5 RMSE of yield to Maturity
Fig. 2
figure 2

Comparison of the bond yield with estimated state variables (full data set) from each model

As seen from Fig. 2, the filtered state variables from both models have trajectories identical to the government bond yields showing that the estimates of the state variables are accurate and a further indication that the default-free algorithm performs well on the full data set as well as at subperiods. It is also important to note that owing to the three-factor nature, the state variables have additional flexibility in which to capture the bond yields.

In the next section, we give the estimates of the defaultable bond process.

3.3.2 Default Risk Parameters Estimation Result

The parameters estimated in the default-free algorithm are kept constant. The default risk parameter estimates for 3-factor CIR and \(\alpha\)-CIR model are given in Tables 67 and 89 respectively. Looking at CIR model estimates in Tables 6 and 7, estimates for all banks are statistically significant. However, in the case of the \(\alpha\)-CIR model estimates (see in Tables 8, 9) standard errors are quite large (see, for example, the results for the African Development Bank and Nedbank).

The parameter \(\alpha\), determines how closely related the model is to the original CIR model by how close it is to the value of 2. One would argue that state variable 2 as estimated from FirstRand data over the whole period is closer in theory to the original CIR model while state variables 1 and 3 are more likely to capture large jumps in interest rates, as well as continuously low-interest rates in the market. However, \(\alpha\) has little effect if \(\sigma _z\) is zero as is the case here. This fact is further emphasized by Fig. 4 which shows default intensity \(h_t\) to follow the dynamics of log corporate bond prices. Nevertheless, as was discussed in the methodology for the \(\alpha\)-CIR model, the methods in determining the model yields for the CIR and the \(\alpha\)-CIR models are the reason for the large differences in state variables estimated by the two models. It can also be reiterated at this point that the advantages of the \(\alpha\)-CIR model are those of the jump diffusion and its ability to capture low-interest rates. Figure 4 shows how the model has achieved this.

Table 6 3-factor CIR model estimates of Default Intensity Processes I
Table 7 3-factor CIR model Estimates of Default Intensity Processes II
Table 8 3-factor \(\alpha\)-CIR model Estimates of Default Intensity Processes I
Table 9 3-factor \(\alpha\)-CIR model Estimates of Default Intensity Processes II

Figure 4 show the state variables that were estimated with the parameter estimates given in Table 6, 7, 8 and 9. Further comparisons were drawn between the state variables for the whole period and the log bond prices. This was done to highlight how the estimation performed. It is clear that the CIR model was able to follow the same trend as that of the bond data while the \(\alpha\)-CIR strayed relatively far away from the bond yield. Thus, the CIR model outperformed the \(\alpha\)-CIR model in the conditions of the economy prior to COVID-19. This is as was expected as the economy was functioning in an ordinary way and its interest rates prior to COVID-19 were not low.

The comparative graphs in Fig. 4 highlight the inefficiency of the CIR model during the COVID-19 period while depicting how the \(\alpha\)-CIR model follows the trend of the log bond quite well. Thus, the \(\alpha\)-CIR model outperforms the CIR model owing to its property of modeling low interest rates and jump behaviour. The South African governor, Lesetja Kganyago (Kganyago, 2021), explains that South African interest rates are at record lows and this warrants a consistent modelling approach. Therefore our results show that a multi-factor \(\alpha\)-CIR model could be the best candidate for modelling interest rates in times of market uncertainty.

Table 10 depicts the mean squared errors (MSE) and Akaike Information Criterion (AIC) of the models fit to corporate bond data across different subperiods. We observed that the \(\alpha\)-CIR model has the lowest MSE for all banks in the sample and across all subperiods (see also Fig. 3). The \(\alpha\)-CIR model has more parameters than the CIR model, in order to have a meaningful comparison, we give AIC. The preferred model is the one with the lowest AIC. The formula for the AIC is

$$\begin{aligned} \text{ AIC } = n\ln (\text{ MSE})+2p, \end{aligned}$$
(3.6)

where n is the number of observations, and p is the number of free parameters to be estimated. As revealed by the AIC and looking at Fig. 3 we see that \(\alpha\)-CIR consistently outperformed the CIR model for pricing corporate bonds.

Table 10 Corporate bond pricing errors
Fig. 3
figure 3

Pricing performance

Fig. 4
figure 4

Corporate bond estimation default intensity using the multi-factor CIR-based models

The CIR-distributed default intensity for the banks in Fig. 4, resulted in the default intensity factors being either decreasing or constant functions. According to (Crouhy et al., 2001), companies with low credit ratings (below investment grade) have decreasing probabilities of default. It should be noted that default intensity is not interchangeable with the probability of default, however, relationship is monotonic. Figure 4 therefore tells the reader that banks below investment grade bonds are clearing indicated by their decreasing default intensity curves.

The credit ratings of the African Development Bank, FirstRand, Nedbank and Standard bank, respectively, were found to be AAA, BB- and BB-, BB- according to Fitch. Furthermore, S &P rated FirstRand as B and Nedbank as BB- (which is a higher rating than FirstRand). These ratings reinforce the results obtained from the CIR model since the banks with lower credit ratings have steeper downward curvature of their default intensities. African Development Bank with the highest rating has the least downward curvature and indicates a stable default intensity. First Rand with the lowest credit rating according to S &P, has the steepest default intensity. Lastly, Nedbank has a higher rating than FirstRand and a more stable curve.

Moreover, Fig. 4 portrays the results of the three-factor default intensity parameter from the \(\alpha\)-CIR model. It is clear that the two models portray opposite results to one another with the default intensities of FirstRand and Nedbank now being upward-sloping. Upward-sloping default intensities are constant with the standard understanding of the probability of default. That is, default risk as seen from today, increases when looking further into the future.

At first glance, the results from the \(\alpha\)-CIR model are more reliable since they are consistent with standard default risk understanding and since the default free form or this model outperformed the CIR model during the COVID-19 period. Recall that these results are for the period during the pandemic.

Next, we show the probability of a major bank defaulting within the next n-year. As revealed in Table 11 AFDB and FSRSJ are found to have a low probability of default within one year. AFDB is rated much more highly than the other banks and the probability of default are consistently lower than other banks. NEDSJ is found to have the highest probability of default within the next 20 years. Moreover, we are puzzled to see that although FSRSJ has a low probability of default within one-year compared to STABAN, the 20-year probability of default for FSRSJ is higher than that for STABAN. This could be due to the fact that the model is fitted to short-term data (for maturity ranging from 1 to 10 years), and almost all of the values in Table 11 represent extrapolation, which is something that usually models do badly.

Table 11 Probability of default

4 Concluding Remarks

In this paper we began with an understanding of the South African bond yield data that the estimation procedure was implemented. We considered 3-factor CIR and \(\alpha\)-CIR models under completely affine specification. Comparisons were drawn for the periods before COVID-19 as well as during COVID-19 in order to ascertain which model performed better under the different circumstances. The defaultable results were compared during the COVID-19 period for both models. We found that the South African context provided a unique and interesting setting in which to estimate stochastic default intensity. This was owing to the South African bonds being an attractive investment. Their low-rated status means that they are riskier than investment-grade bonds and thus provide a higher yield to investors. Additionally, South African bonds perform better than other below-investment-grade bonds, and thus investments in the bonds are attractive.

The South African economy experienced an economic territory owing to the pandemic. The widespread effect that the pandemic has had on the economy was far-reaching and this ushered us to critically examine the behavour of default intensity specifically at this time.

The main focus of this paper was to estimate the unobservable factors affecting bond yields and to ultimately estimate the default intensity of corporate bonds.

The CIR model has traditionally been a preferred interest rate model since interest rates have commonly been considered positive up until recent years. During the COVID-19 pandemic, South Africa has experienced all-time interest rate lows. As a result of this, an extension to the CIR model, known as the \(\alpha\)-CIR model was chosen to compare its performance to the CIR model. The model is able to capture continuously low-interest rates as well as sizeable jumps that are prevalent during COVID-19 since it is a period of economic turmoil. This paper aimed at assessing the performance of each interest rate model in different time phases to draw conclusions, about which model performs best.

It was found that the CIR model performed well but not as well as that of the \(\alpha\)-CIR model before the pandemic. This is understandable as interest rates prior to COVID-19 were at a moderate level and the economy was relatively stable.

It was subsequently found that the \(\alpha\)-CIR model outperformed the CIR model during the outlined COVID-19 period. This can be attributed to the fact that the \(\alpha\)-CIR model is able to model low-interest rates that, as well as sizeable jumps, which have been prevalent during the pandemic.

The defaultable section was implemented on the data during the COVID-19 period. One would expect that, as with the results in the default-free section, the \(\alpha\)-CIR model would be a better model. However, owing to the inconclusive results attained from the model it was not considered as reliable. The CIR model showed that FirstRand and Nedbank, had sharp decreasing curves. This can be attributed to the credit quality of the banks being that of BBB- and Ba2. This means that initially there is a high probability of default, however, as time passes the probability of default decreases. African Development Bank has a more stable straight curve. This is owing to the fact that the bank has an AAA rating.

The research conducted showed that there was a clear significance in using a three-factor model to estimate the unobservable state variables. The graphs clearly show that an additional state variable enables a better fit to some of the bond yields, as well as a more accurate fit.

The CIR and \(\alpha\)-CIR interest rate models fall under the equilibrium class of models. Research has also been conducted on a similar topic using the Vasicek model for the short rate. However, further research on this topic can be conducted to compare the performance of the equilibrium class models with no-arbitrage models. According to Hull (2015), no-arbitrage models better capture the reality of interest rates since the drift term is a function of time (unlike the equilibrium models). Due to this adjustment, the zero rate in the market at each point in time would be used to calculate the short rate for the future. Using these models, the estimates for the state variables and default intensity parameters would be updated once new market information is made available. Estimates would thus not primarily be based on historical data.

Since the performance difference between the two models considered is not substantial, we may conclude that both are equally well suited to ZAR bond data. Furthermore, this paper sheds light on the dynamics and peculiarities of the South African financial market, which may have practical implications for professionals or policymakers in the context of default risk management.

Lastly, alternative machine learning algorithms could provide a possible enhancement to estimating stochastic default intensity. The extended Kalman filtering algorithm that was implemented in this paper is a form of machine learning. However, alternative machine learning algorithms could provide possible enhancements to parameter estimation, particularly in the field of code efficiency. An added benefit of exploring alternative machine learning algorithms would be that of an algorithm that learns as additional data is incorporated into the model. Accuracy could be improved when applying this method.