Abstract
This paper considers portfolio optimization under a long-run risk sensitive criterion. The economic factors that drive asset prices are assumed to be ergodic but not necessarily uniformly ergodic. The existence of a solution to the associated Bellman equation is shown using a local span contraction with weighted norms. The form of the optimal strategy is presented, together with examples of market models satisfying the imposed assumptions.
1 Introduction
Many stochastic control methods are used in theoretical studies of portfolio management (cf. Prigent 2007 and references therein). Among them, risk sensitive control is one of the most widely recognised. For an infinite time horizon, any portfolio value process V and risk-aversion parameter \(\gamma <0\), the risk sensitive criterion (RSC) function is given by
$$\begin{aligned} \varphi ^{\gamma }(V):=\liminf _{t\rightarrow \infty }\frac{1}{t}\frac{1}{\gamma }\ln \mathbb {E}\left[ V_{t}^{\gamma }\right] . \end{aligned}$$(1)
Using this objective function in portfolio management has many advantages over the standard theoretical methods, which are usually based on expected utility criteria. To mention only two, those methods face difficulties in estimating model parameters and tractability problems when one tries to compute optimal trading strategies for realistic security market models (Bielecki and Pliska 2003). For RSC, applying the Taylor expansion around \(\gamma =0\), we get
$$\begin{aligned} \varphi ^{\gamma }(V)=\liminf _{t\rightarrow \infty }\left[ \frac{\mathbb {E}[\ln V_{t}]}{t}+\frac{\gamma }{2}\frac{{\text {Var}}[\ln V_{t}]}{t}+\frac{O(\gamma ^{2},t)}{t}\right] , \end{aligned}$$(2)
which shows that this map can be seen as a measure of performance, as it penalises the expected growth rate with the asymptotic variance multiplied by the risk-aversion parameter \(\gamma <0\). Of course, this applies only to problems for which the last term (i.e. \(O(\gamma ^{2},t)/t\)) vanishes as t goes to infinity. Nevertheless, this assumption is satisfied for many standard dynamics, as explained in Bielecki and Pliska (2003, Section 5), so (2) explains the motivation behind this class of maps. We refer to Bielecki and Pliska (2003) for further discussion of the economic properties of RSC.
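The expansion behind this interpretation is the standard cumulant expansion of the entropic utility. As a sketch, write \(Y=\ln V_{t}\) for fixed t and assume (an assumption made here) that Y has finite exponential moments in a neighbourhood of zero. Then
$$\begin{aligned} \ln \mathbb {E}\left[ e^{\gamma Y}\right]&=\gamma \mathbb {E}[Y]+\frac{\gamma ^{2}}{2}{\text {Var}}[Y]+O(\gamma ^{3}),\\ \frac{1}{\gamma }\ln \mathbb {E}\left[ e^{\gamma Y}\right]&=\mathbb {E}[Y]+\frac{\gamma }{2}{\text {Var}}[Y]+O(\gamma ^{2}). \end{aligned}$$
If Y is Gaussian with mean m and variance \(s^{2}\), the remainder vanishes and the right-hand side equals \(m+\frac{\gamma }{2}s^{2}\) exactly, so for \(\gamma <0\) the variance enters as a penalty.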
Following Bielecki et al. (2015) and Gülten and Ruszczyński (2015), we would like to stress that RSC can be seen as a risk-to-reward criterion. In fact, RSC can be considered an acceptability index (Cherny and Madan 2009; Bielecki et al. 2014), i.e. a map quantifying the trade-off between portfolio growth and the risk associated with it. Many methods from risk and performance measurement theory can be applied directly to RSC, as we will show in this paper.
From another point of view, RSC is a good objective function for many optimal control problems related to (controlled) Markov decision processes both on finite and infinite time horizons (cf. Hernández-Lerma and Lasserre 1996; Hernández-Lerma 1989; Di Masi and Stettner 1999; Cavazos-Cadena and Hernández-Hernández 2005 and references therein). In particular, the connection to portfolio optimization was shown in Bielecki and Pliska (1999), where RSC was applied to continuous time infinite time horizon, and a version of Merton’s intertemporal capital asset pricing model (Merton 1973) was considered. The analogous study for discrete time market model was done in Stettner (1999).
For this reason, we have decided to present our results in such a way that they may be of interest both to specialists in risk analysis, in particular those studying dynamic growth indices, and to specialists in risk sensitive control of Markov decision processes.
There are many sophisticated methods that guarantee the existence of a solution to the Bellman equation associated with RSC. Let us mention only the vanishing discount approach (Hernández-Hernández and Marcus 1996) and the fixed point approach (Di Masi and Stettner 1999). The assumptions under which the existence of solutions is guaranteed are usually related to ergodic properties of the considered process (Di Masi and Stettner 1999; Kontoyiannis and Meyn 2003; Hernández-Lerma 1989; Hernández-Hernández and Marcus 1996). The most recent results relate to localized Doeblin’s conditions (Cavazos-Cadena and Hernández-Hernández 2005) and Markov splitting techniques (Di Masi and Stettner 2006a). The theory of RSC is also closely connected to multiplicative Poisson equations (Di Masi and Stettner 2006a) and Isaacs equations for ergodic cost stochastic dynamic games (cf. Hernández-Hernández and Marcus 1996; Fleming and Hernández-Hernández 1997; Dai Pra et al. 1996 and references therein).
In this paper, we generalize the results of Stettner (1999) in the sense that we consider a market model with more general economic factors, which are not necessarily uniformly ergodic; consequently, when studying the Bellman equation, we have to work with suitable weight functions. Such more general economic factors were studied for the Black-Scholes market in Bielecki and Pliska (1999) and subsequently for general continuous-time diffusion models in Nagai (2003). Here we study a discrete-time model, motivated by the attempt in Shen et al. (2013) to generalize the risk neutral results of Hairer and Mattingly (2011) to the risk sensitive portfolio setting.
The main novelty of the paper is that we obtain, using a weighted span norm contraction method, the existence of solutions to a suitable Bellman equation. Consequently, our results can be applied to more general market dynamics than in Stettner (1999). Furthermore, we solve a risk sensitive control problem with unbounded solutions to the Bellman equation.
This paper is organized as follows. Section 2 presents the general setup, where we state all assumptions central to our study (e.g. on dynamics, control, etc.). Next, in Sect. 3 we recall some basic notation for the weighted norms and span-norms. In Sect. 4 we present the main results of this paper, i.e. we state the Bellman equation and show when it can be solved. In Sect. 5 we show how to connect the Bellman equation to the initial investment problem. In particular, we discuss how to construct the optimal strategy from a solution to the Bellman equation and when this is possible. Finally, in Sect. 6 we give examples of dynamics that fit our model.
2 Preliminaries
Let \((\varOmega ,\mathcal {F},\{\mathcal {F}_{t}\}_{t\in \mathbb {T}},\mathbb {P})\) be a discrete-time filtered probability space, where \(\mathbb {T}=\mathbb {N}\), \(\mathcal {F}_{0}\) is trivial, \(\mathcal {F}=\bigcup _{t\in \mathbb {T}}\mathcal {F}_{t}\,\) and convention \(\mathbb {N}=\{0,1,2,\ldots \}\) is used. Moreover, let \(L^{0}:=L^{0}(\varOmega ,\mathcal {F},\mathbb {P})\) correspond to the space of all (a.s. identified) \(\mathcal {F}\)-measurable random variables, and let \(L^{1}:=L^{1}(\varOmega ,\mathcal {F},\mathbb {P})\).
We will assume that the market consists of m risky assets (e.g. stocks, bonds, derivative securities) and k economic factors (e.g. rates of inflation, short term interest rates, dividend yields). Prices of the m risky assets will be denoted by \(S^{i}=(S_{t}^{i})_{t\in \mathbb {T}}\) for \(i=1,\ldots ,m\), and levels of the k economic factors will be denoted by \(X^{j}=(X_{t}^{j})_{t\in \mathbb {T}}\) for \(j=1,\ldots ,k\). For simplicity, we will write \(S:=(S_{t})_{t\in \mathbb {T}}\) and \(X:=(X_{t})_{t\in \mathbb {T}}\), where \(S_{t}=(S_{t}^{1},\ldots ,S_{t}^{m})\) and \(X_{t}=(X_{t}^{1},\ldots ,X_{t}^{k})\).
We will use \(\mathcal {A}\) to denote the set of all U-valued adapted processes, where U is a compact subset of \(\mathbb {R}^{m}\). Elements of \(\mathcal {A}\) will correspond to all admissible portfolio strategies \(H:=(H_{t})_{t\in \mathbb {T}}\), where \(H_{t}=(H_{t}^1,\ldots ,H_{t}^m)\) and \(H^{i}=(H^i_t)_{t\in \mathbb {T}}\) is a part of capital invested in i-th risky asset (for \(i=1,\ldots ,m\)). Furthermore, we will use notation \(V^{H}=(V_{t}^{H})_{t\in \mathbb {T}}\) to denote the portfolio value process corresponding to strategy H.
Throughout this paper we will make the following assumptions:
-
(A.1)
The factor process X is Markov and admits the following representation:
$$\begin{aligned} X_0\in \mathbb {R}^{k},\quad X_{t+1}=G(X_{t},W_{t}):=(G^{1}(X_{t},W_{t}),\ldots , G^{k}(X_{t},W_{t})), \end{aligned}$$
where \(G^{i}:\mathbb {R}^{k} \times \mathbb {R}^{k+m}\rightarrow \mathbb {R}\) is a Borel measurable function, continuous with respect to the first variable (for \(i=1,\ldots ,k\)), and the sequence \((W_t)_{t\in \mathbb {T}}\) is i.i.d. with values in \(\mathbb {R}^{k+m}\), such that, for each \({t\in \mathbb {T}}\), the random variable \(W_t\) is independent of \(\mathcal {F}_{t}\) and \(\mathcal {F}_{t+1}\)-measurable.
-
(A.2)
For any \(H\in \mathcal {A}\), the portfolio dynamics is of the form
$$\begin{aligned} V^{H}_{0}=V_{0},\quad \quad \ln \frac{V^{H}_{t+1}}{V^{H}_{t}}=F(X_t,H_t,W_t), \end{aligned}$$(3)
for \(t\in \mathbb {T}\), where \(V_{0}>0\) and \(F:\mathbb {R}^k\times U\times \mathbb {R}^{k+m}\rightarrow \mathbb {R}\) is a Borel measurable function, continuous with respect to the first two variables.
-
(A.3)
For any \(w\in \mathbb {R}^{k+m}\), \(x\in \mathbb {R}^{k}\), \(h\in U\) we have
$$\begin{aligned} \omega (G(x,w))&\le a_{1}(w)+b_{1}\omega (x), \end{aligned}$$(4)
$$\begin{aligned} |F(x,h,w)|&\le a_{2}(w)+b_{2}\omega (x) , \end{aligned}$$(5)
for Borel measurable functions \(a_1,a_2:\mathbb {R}^{k+m}\rightarrow \mathbb {R}_{+}\), constants \(b_1\in (0,1)\), \(b_2>0\) and a continuous measurable function \(\omega :\mathbb {R}^{k}\rightarrow [0,\infty )\), which we shall refer to as the weight function. Moreover, for any \(\gamma \in \mathbb {R}\),
$$\begin{aligned} \mu ^{\gamma }(a_1(W_{0}))\in \mathbb {R}\quad {\text {and}}\quad \mu ^{\gamma }(a_2(W_{0}))\in \mathbb {R}, \end{aligned}$$(6)
where \(\mu ^{\gamma }:L^{0}\rightarrow \bar{\mathbb {R}}\) is the entropic utility measure, i.e.
$$\begin{aligned} \mu ^{\gamma }(X):=\left\{ \begin{array}{ll} \frac{1}{\gamma }\ln \mathbb {E}[\exp (\gamma X)] &{}\quad {\text {if } }\gamma \ne 0,\\ \mathbb {E}[X] &{} \quad {\text {if }} \gamma =0. \end{array}\right. \end{aligned}$$(7)
-
(A.4)
For any \(R>0\), there exists a constant \(c>0\) and probability measure \(\nu \), such that
$$\begin{aligned} \inf _{x\in C_{R}}\mathbb {P}[G(x,W_0)\in A]\ge c\nu (A),\quad A\in \mathcal {B}(\mathbb {R}^{k}), \end{aligned}$$(8)
where \(C_{R}=\{x\in \mathbb {R}^{k}{:}\, \omega (x)\le R\}\).
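The entropic utility measure (7) is easy to evaluate for a finitely supported random variable. The following is a minimal sketch, not part of the original model: the function name `entropic_utility` and the three-point distribution are illustrative assumptions. It uses a log-sum-exp shift for numerical stability.

```python
import numpy as np

def entropic_utility(values, probs, gamma):
    """Entropic utility of a finitely supported X, following (7).

    Returns (1/gamma) * ln E[exp(gamma X)] for gamma != 0 and E[X] for gamma == 0.
    """
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    if gamma == 0.0:
        return float(np.dot(probs, values))
    z = gamma * values
    shift = z.max()  # log-sum-exp shift to avoid overflow
    return float((shift + np.log(np.dot(probs, np.exp(z - shift)))) / gamma)

# Hypothetical one-period log-return with three outcomes.
x = [-0.10, 0.02, 0.08]
p = [0.2, 0.5, 0.3]
mean = entropic_utility(x, p, 0.0)
risk_averse = entropic_utility(x, p, -5.0)
```

For \(\gamma <0\) the value lies below the mean (by Jensen's inequality), and shifting X by a constant shifts the value by the same constant, which is the translation invariance used later in the proofs.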
Assumption (A.1) relates to classic conditions imposed on the factor process.
Assumption (A.2) is technical. It allows us to model portfolios through log-returns, rather than value processes (see e.g. Example 1 or Stettner (1999) for more details).
Assumption (A.3) has a financial interpretation. The state-space constraints \(b_1\) and \(b_2\) introduced in (4) and (5) say that in our model we allow only \(\omega \)-growth (i.e. growth proportional to the growth of \(\omega \)) with respect to the state space. In particular, inequality (4) might be seen as a form of the geometric drift condition imposed on X (cf. Hairer and Mattingly 2011). On the other hand, assumption (6) allows us to control the entropy of the noise part. In a more probabilistic setting, it is equivalent to the statement that the moment generating functions of \(a_1(W_{0})\) and \(a_2(W_{0})\) exist. In particular, we might say that the utility (or risk) of a single period log-return at time t measured by \(\mu ^{\gamma }\) (or \(-\mu ^{\gamma }\)) must be finite for any simple trade (in any fixed state), and in fact it is bounded by \(\pm a_2(W_{t})\) plus some constant (dependent on the state). Note that this assumption is rather weak and is fulfilled by standard models, which describe log-returns as processes of the form
where \(W_{t}=(W_{t}^{1},\ldots ,W_{t}^{k+m})\) is a random vector with multidimensional normal distribution and functions a and b satisfy \(\omega \)-growth constraints. Then, the function \(a_2\) could be constructed using random variables \(\min (W_{t}^{1},\ldots ,W_{t}^{k+m})\) and \(\max (W_{t}^{1},\ldots ,W_{t}^{k+m})\).
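As a hedged illustration of (4) and (6), consider a hypothetical one-factor model (\(k=1\); the coefficients \(\rho \) and \(\sigma \) are assumptions made here, not taken from the text) with autoregressive dynamics and weight function \(\omega (x)=|x|\):
$$\begin{aligned} G(x,w)&=\rho x+\sigma w^{1},\qquad 0<|\rho |<1,\ \sigma >0,\\ \omega (G(x,w))&=|\rho x+\sigma w^{1}|\le \sigma |w^{1}|+|\rho |\,\omega (x). \end{aligned}$$
Hence (4) holds with \(a_{1}(w)=\sigma |w^{1}|\) and \(b_{1}=|\rho |\). If \(W_{0}^{1}\) is standard normal, then \(\mathbb {E}[\exp (\gamma \sigma |W_{0}^{1}|)]\) is finite for every \(\gamma \in \mathbb {R}\), so the first condition in (6) is satisfied as well.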
Assumption (A.4) is a (local) minorization property. Combined with the geometric drift condition, it allows us to exploit the ergodic properties of X (cf. Hairer and Mattingly 2011). Note that setting \(\omega \equiv 0\), for any \(R>0\) we get \(C_{R}=\mathbb {R}^{k}\). Consequently, in this particular case, (A.4) becomes a global Doeblin’s condition, which is equivalent to the uniform ergodicity of the process X. On the other hand, if \(\omega \) is unbounded and \(C_{R}\) is compact for any \(R>0\), then (8) is directly linked to the (local) mixing condition, i.e. the statement that for any fixed compact subset K (of \(\mathbb {R}^{k}\)), we get
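The main goal of this paper is to optimize the risk sensitive cost criterion \(\varphi ^{\gamma }\) given by (1), i.e.
$$\begin{aligned} \varphi ^{\gamma }(V)=\liminf _{t\rightarrow \infty }\frac{1}{t}\frac{1}{\gamma }\ln \mathbb {E}\left[ V_{t}^{\gamma }\right] , \end{aligned}$$
where \(\gamma <0\) is a fixed risk aversion parameter and V is the portfolio value process. In other words, given the set \(\mathcal {A}\) and the dynamics of \(V^{H}\) for any \(H\in \mathcal {A}\), we want to solve the optimal stochastic control problem
$$\begin{aligned} \sup _{H\in \mathcal {A}}\varphi ^{\gamma }(V^{H}). \end{aligned}$$(10)
Using the entropic representation of \(\varphi ^{\gamma }\) (see Bielecki et al. 2015 for more details) and (3), for any \(H\in \mathcal {A}\), we get
$$\begin{aligned} \varphi ^{\gamma }(V^{H})=\liminf _{t\rightarrow \infty }\frac{\mu ^{\gamma }(\ln V_{t}^{H})}{t}=\liminf _{t\rightarrow \infty }\frac{\mu ^{\gamma }\left( \ln V_{0}+\sum _{i=0}^{t-1}F(X_{i},H_{i},W_{i})\right) }{t}, \end{aligned}$$(11)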
where \(\mu ^{\gamma }\) is the entropic utility measure given by (7). Note that the first equality in (11) provides another financial interpretation of the RSC. The logarithmic transform of \(V_{t}^{H}\) allows us to measure the cumulative growth (log-return) at time t, while the map \(\mu ^{\gamma }\) is used to evaluate its (entropic) utility. Then, the outcome is divided by t to normalise it in time, and \(\liminf \) is used to measure (a worst case robust version of) the long-time efficiency of the value process (cf. Bielecki et al. 2015).
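A worked special case may help fix ideas. Assume, purely for illustration, that \(F(x,h,w)=m+s\,w^{1}\) depends neither on the factor nor on the strategy, and that the variables \(W_{t}^{1}\) are i.i.d. standard normal (m and \(s>0\) are hypothetical constants). Since the entropic utility is additive over independent summands and \(\mu ^{\gamma }(m+sW_{0}^{1})=m+\frac{\gamma }{2}s^{2}\) in the Gaussian case, the entropic representation described above gives
$$\begin{aligned} \mu ^{\gamma }(\ln V_{t})&=\ln V_{0}+\sum _{i=0}^{t-1}\mu ^{\gamma }(m+sW_{i}^{1})=\ln V_{0}+t\left( m+\frac{\gamma }{2}s^{2}\right) ,\\ \varphi ^{\gamma }(V)&=\liminf _{t\rightarrow \infty }\frac{\mu ^{\gamma }(\ln V_{t})}{t}=m+\frac{\gamma }{2}s^{2}. \end{aligned}$$
For \(\gamma <0\), the long-run growth rate m is thus reduced by \(|\gamma |s^{2}/2\).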
Under the above assumptions, it is not difficult to see from (11) that the optimal value of problem (10) is finite; this is the statement of Proposition 1.
Proposition 1
Let \(\gamma <0\). Under assumptions (A.1)–(A.3), we get
Proof
Using (A.2) and (A.3), for any \(H\in \mathcal {A}\) and \(t\in \mathbb {T}\), we get
As the entropic utility measure \(\mu ^{\gamma }\) is monotone, translation invariant, additive for any two independent random variables and law invariant (Kupper and Schachermayer 2009), for any \(t\in \mathbb {T}\), we get
Consequently, using (11) and (6), for any \(H\in \mathcal {A}\), we get
The proof of the other inequality is analogous. \(\square \)
3 Weighted norms
In assumption (A.3) we introduced a measurable and continuous function \(\omega :\mathbb {R}^{k}\rightarrow [0,\infty )\), which we referred to as the weight function. Following Hairer and Mattingly (2011), let us now recall basic notation regarding such functions. We shall denote by \(\mathcal {C}_{\omega }(\mathbb {R}^{k})\) the set of all continuous and measurable functions \(f:\mathbb {R}^{k}\rightarrow \mathbb {R}\), such that the \(\omega \)-norm of f is finite, i.e.
$$\begin{aligned} \Vert f\Vert _{\omega }:=\sup _{x\in \mathbb {R}^{k}}\frac{|f(x)|}{1+\omega (x)}<\infty . \end{aligned}$$
Next, we define the \(\omega \)-span seminorm of \(f\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) by
$$\begin{aligned} \Vert f\Vert _{\omega {\text {-span}}}:=\sup _{x,y\in \mathbb {R}^{k}}\frac{f(x)-f(y)}{2+\omega (x)+\omega (y)}. \end{aligned}$$
Remark 1
The classic span-norm of function \(f:\mathbb {R}^{k}\rightarrow \mathbb {R}\) (cf. Hernández-Lerma and Lasserre 1996 and references therein) is usually defined as \(\Vert f \Vert _{{\text {span}}}=\sup _{x}f(x)-\inf _{y}f(y)\). Note that in our framework, using \(\omega \equiv 0\), we get \(\Vert f\Vert _{\omega {\text {-span}}}=\frac{\sup _{x}f(x)-\inf _{x}f(x)}{2}=\frac{1}{2}\Vert f\Vert _{{\text {span}}}\). Moreover, for any bounded weight function \(\omega \), we know that \(\Vert \cdot \Vert _{{\text {span}}}\) and \(\Vert \cdot \Vert _{\omega {\text {-span}}}\) are equivalent.
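For any \(\beta >0\) we shall also define the weighted (semi)norms given by
$$\begin{aligned} \Vert f\Vert _{\beta ,\omega }:=\sup _{x\in \mathbb {R}^{k}}\frac{|f(x)|}{1+\beta \omega (x)},\qquad \Vert f\Vert _{\beta ,\omega {\text {-span}}}:=\sup _{x,y\in \mathbb {R}^{k}}\frac{f(x)-f(y)}{2+\beta \omega (x)+\beta \omega (y)}. \end{aligned}$$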
Please note that for any \(\beta >0\) and \(c\ge 0\), the function \(\omega ':\mathbb {R}^{k}\rightarrow [0,\infty )\), given by \(\omega '(x)=\beta \omega (x)+c\) is also a weight function. Let us now recall some basic properties of weighted norms and related span norms.
Proposition 2
Let \(\omega :\mathbb {R}^{k}\rightarrow [0,\infty )\) be a weight function. Then
-
1)
For any \(\beta >0\), the norms \(\Vert \cdot \Vert _{\omega }\) and \(\Vert \cdot \Vert _{\beta ,\omega }\) are equivalent.
-
2)
For any \(\beta >0\), the seminorms \(\Vert \cdot \Vert _{\omega {\text {-span}}}\) and \(\Vert \cdot \Vert _{\beta ,\omega {\text {-span}}}\) are equivalent.
-
3)
For any \(0<\beta <1\) and \(f\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\), we get \(\Vert f\Vert _{\omega {\text {-span}}}\le \Vert f\Vert _{\beta ,\omega {\text {-span}}}\).
-
4)
For any \(f\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) we get \(\inf _{c\in \mathbb {R}}\Vert f+c\Vert _{\omega }=\Vert f\Vert _{\omega {\text {-span}}}\).
-
5)
Let \(f\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) and \(c\in \mathbb {R}\). Then \(\Vert f+c\Vert _{\omega }=\Vert f\Vert _{\omega {\text {-span}}}\) if and only if \(c\in [c_1,c_2]\), where
$$\begin{aligned} c_1= & {} -\inf _{x\in \mathbb {R}^{k}} \left\{ f(x)+(1+\omega (x))\Vert f\Vert _{\omega {\text {-span}}}\right\} , \end{aligned}$$(12)
$$\begin{aligned} c_2= & {} -\sup _{x\in \mathbb {R}^{k}} \left\{ f(x)-(1+\omega (x))\Vert f\Vert _{\omega {\text {-span}}}\right\} . \end{aligned}$$(13)
Moreover, there exists \(c_0\in \{c_1,c_2\}\), such that
$$\begin{aligned} \Vert f+c_0\Vert _{\omega }=\sup _{x\in \mathbb {R}^{k}}\frac{f(x) +c_0}{1+\omega (x)}=-\inf _{x\in \mathbb {R}^{k}}\frac{f(x)+c_0}{1+\omega (x)}. \end{aligned}$$(14)
Proof
The proof of properties 1), 2) and 3) is straightforward and hence omitted here.
4) The proof is based on Hairer and Mattingly (2011, Lemma 2.1) and is recalled for completeness. Let \(f\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\).
For any \(x\in \mathbb {R}^{k}\), we get \(|f(x)|\le \Vert f\Vert _{\omega }(1+\omega (x))\), which in turn implies
Consequently, for any \(c\in \mathbb {R}\) we get
Let us now prove the other inequality. Noting, that we could take \(a\cdot f\) instead of f, for some \(a>0\) and the proof for the case \(\Vert f\Vert _{\omega {\text {-span}}}=0\) is trivial, without loss of generality we could assume that \(\Vert f\Vert _{\omega {\text {-span}}}=1\). By the definition of \(\Vert \cdot \Vert _{\omega {\text {-span}}}\) and the fact that \(\Vert f\Vert _{\omega {\text {-span}}}=1\), we get
for any \(x,y \in \mathbb {R}^{k}\). Thus, \(c_1:=-\inf _{y\in \mathbb {R}^{k}} \left\{ f(y)+1+\omega (y)\right\} \in \mathbb {R}\) and for any \(x\in \mathbb {R}^k\), we get
On the other hand, for any \(x\in \mathbb {R}^{k}\), we get
Combining (16) and (17), we get \(\Vert f+c_1\Vert _{\omega }\le 1\). This, together with (15), concludes the proof of 4).
5) Let \(f\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) and let \(c\in \mathbb {R}\). Repeating and slightly modifying the proof of 4) it is easy to check that
If \(c\in [c_1,c_2]\), then there exists \(\alpha \in [0,1]\) such that \(c=\alpha c_1+(1-\alpha ) c_2\). Thus, using (15) and (18), we get
On the other hand, we know that if \(\Vert f+c\Vert _{\omega }=\Vert f\Vert _{\omega {\text {-span}}}\), then for any \(x\in \mathbb {R}^{k}\) we get
Because of that, for any \(x\in \mathbb {R}^{k}\) we have
and consequently \(c_1\le c\le c_2\). This completes the first part of the proof. Let us now show that there exists (at least one) \(c_0\in [c_1,c_2]\), satisfying (14).
Given \(f\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\), for any \(c\in \mathbb {R}\) we define
It is easy to note that \(a_+(\cdot )\) is finite, continuous and non-decreasing, while \(a_-(\cdot )\) is finite, continuous and non-increasing. Moreover \(a_+(c)\rightarrow \infty \), as \(c\rightarrow \infty \), and \(a_-(c)\rightarrow \infty \), as \(c\rightarrow -\infty \). Thus, there exists \(c_0\in \mathbb {R}\), such that \(a_{+}(c_0)=a_{-}(c_0)\). Moreover, for any \(c\ge c_0\) we get
while for \(c\le c_0\) we get
Consequently,
By the first part of the proof of 5), we know that \(c_0\in [c_1,c_2]\). If \(c_0\) is equal to \(c_1\) or \(c_2\), then the proof is finished. On the contrary, let us assume that \(c_0\not \in \{c_1,c_2\}\). By using monotonicity of \(a_{+}(\cdot )\) we have \(a_{+}(c_0)\le a_{+}(c_2)\) and by (19) using
we obtain \( a_{+}(c_2)= a_{+}(c_0)\). Consequently \(a_+(\cdot )\) must be constant on \([c_0,c_2]\) and, as a convex nondecreasing mapping, it is in fact constant on \((-\infty ,c_2]\). Using similar arguments, we get that \(a_-(\cdot )\), as a nonincreasing convex mapping, must be constant on \([c_1,\infty )\). Consequently, both \(c_1\) and \(c_2\) satisfy (14), which concludes the proof. \(\square \)
Remark 2
We might get \(c_1\ne c_2\). Let \(f(x)=0\) for \(|x|\le 1\), and \(f(x)=|x-{1\over x}|\) for \(|x|\ge 1\). Then, for \(\omega (x)=|x|\), it is easy to check that \(\Vert f\Vert _{\omega {\text {-span}}}=1\), \(c_1=-1\) and \(c_2=1\).
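The claims of Remark 2 can be checked numerically. The sketch below approximates the \(\omega \)-norm and the \(\omega \)-span seminorm by maxima over a finite grid, assuming the usual definitions \(\sup _{x}|f(x)|/(1+\omega (x))\) and \(\sup _{x,y}(f(x)-f(y))/(2+\omega (x)+\omega (y))\). The helper names `omega_norm` and `omega_span` and the grid are illustrative choices, and grid maxima only bound the true suprema from below.

```python
import numpy as np

def omega_norm(f_vals, w_vals):
    """Grid approximation of sup_x |f(x)| / (1 + omega(x))."""
    return float(np.max(np.abs(f_vals) / (1.0 + w_vals)))

def omega_span(f_vals, w_vals):
    """Grid approximation of sup_{x,y} (f(x) - f(y)) / (2 + omega(x) + omega(y))."""
    diff = f_vals[:, None] - f_vals[None, :]
    denom = 2.0 + w_vals[:, None] + w_vals[None, :]
    return float(np.max(diff / denom))

# Grid on the real line (k = 1): dense near the origin, geometric in the tails.
tail = np.geomspace(1.0, 1e6, 400)
x = np.concatenate([np.linspace(-1.0, 1.0, 201), tail, -tail])
w = np.abs(x)  # weight function omega(x) = |x|

# f from Remark 2: zero on [-1, 1] and |x - 1/x| = |x| - 1/|x| outside.
f = np.where(w <= 1.0, 0.0, w - 1.0 / np.maximum(w, 1.0))

span = omega_span(f, w)
norms = {c: omega_norm(f + c, w) for c in (-1.0, 0.0, 1.0, 1.5)}
```

On this grid the span is close to 1, the shifted norms for c in \([-1,1]\) are close to 1 as well, and a shift outside that interval (here c = 1.5) gives a strictly larger norm, in line with property 5) of Proposition 2.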
Remark 3
One might look at \(c_0\) as a centering constant for the weighted f, i.e. the constant such that the distance from 0 to \(\sup _{x\in \mathbb {R}^{k}}\frac{f(x)+c_0}{1+\omega (x)}\) is the same as the distance from 0 to \(\inf _{x\in \mathbb {R}^{k}}\frac{f(x)+c_0}{1+\omega (x)}\). In particular, the \(\Vert \cdot \Vert _{\omega {\text {-span}}}\) seminorm might be considered as a \(\Vert \cdot \Vert _{\omega }\) norm of the centered function, which provides some insight into 4) in Proposition 2.
Proposition 2 implies that for any \(\beta >0\), \(c\ge 0\), \(f:\mathbb {R}^{k}\rightarrow \mathbb {R}\) and \(\omega '\) defined by \(\omega '(x)=\beta \omega (x)+c\), we get
which in turn implies
Moreover, if a family of functions is uniformly bounded wrt. \(\omega \)-span norm, then it is uniformly bounded wrt. \(\omega '\)-span norm.
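Next, for any \(\beta >0\), two probability measures \(\mathbb {Q}_{1}\) and \(\mathbb {Q}_{2}\) on \((\mathbb {R}^{k},\mathcal {B}(\mathbb {R}^{k}))\) and the corresponding signed measure \(\mathbb {H}=\mathbb {Q}_{1}-\mathbb {Q}_{2}\), let \(\Vert \mathbb {H}\Vert _{\beta ,\omega {\text {-var}}}\) denote its weighted total variation norm given by
$$\begin{aligned} \Vert \mathbb {H}\Vert _{\beta ,\omega {\text {-var}}}:=\int _{\mathbb {R}^{k}}(1+\beta \omega (x))\,|\mathbb {H}|(dx), \end{aligned}$$
where \(|\mathbb {H}|\) denotes the total variation of \(\mathbb {H}\), i.e.
$$\begin{aligned} |\mathbb {H}|(B):=\mathbb {H}(A\cap B)-\mathbb {H}(A^{c}\cap B),\quad B\in \mathcal {B}(\mathbb {R}^{k}), \end{aligned}$$
for A being a positive set for the measure \(\mathbb {H}\) (obtained e.g. using the Hahn-Jordan decomposition). In particular (for \(\omega \equiv 0\)), let \(\Vert \mathbb {H}\Vert _{{\text {var}}}\) denote the standard total variation norm (Hernández-Lerma and Lasserre 1996), i.e.
$$\begin{aligned} \Vert \mathbb {H}\Vert _{{\text {var}}}:=\int _{\mathbb {R}^{k}}|\mathbb {H}|(dx)=|\mathbb {H}|(\mathbb {R}^{k}). \end{aligned}$$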
4 Bellman equation
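Using representation (11), it is not hard to see that the Bellman equation corresponding to (10) is of the form
$$\begin{aligned} v(x)+\lambda =\sup _{h\in U}\mu ^{\gamma }\left( F(x,h,W_{0})+v(G(x,W_{0}))\right) , \end{aligned}$$(21)
where \(\lambda \in \mathbb {R}\), \(v\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\), \(x\in \mathbb {R}^{k}\) and \(\omega :\mathbb {R}^{k}\rightarrow [0,\infty )\) is a weight function from (A.3), for which the corresponding Bellman operator
$$\begin{aligned} R_{\gamma }f(x):=\sup _{h\in U}\mu ^{\gamma }\left( F(x,h,W_{0})+f(G(x,W_{0}))\right) \end{aligned}$$(22)
satisfies certain contraction properties.
For computational convenience, let us introduce the associated Bellman equation
$$\begin{aligned} u(x)+\gamma \lambda =\inf _{h\in U}\ln \mathbb {E}\left[ \exp \left( \gamma F(x,h,W_{0})+u(G(x,W_{0}))\right) \right] , \end{aligned}$$(23)
where \(u(x)=\gamma v(x)\) and where the corresponding Bellman operator takes the form
$$\begin{aligned} T_{\gamma }f(x):=\inf _{h\in U}\ln \mathbb {E}\left[ \exp \left( \gamma F(x,h,W_{0})+f(G(x,W_{0}))\right) \right] . \end{aligned}$$(24)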
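To make the associated Bellman operator concrete, the following sketch applies it on a grid, assuming it has the form \(T_{\gamma }f(x)=\inf _{h\in U}\ln \mathbb {E}[\exp (\gamma F(x,h,W_0)+f(G(x,W_0)))]\), which matches the variable \(Z=\gamma F(x,h,W_0)+f(G(x,W_0))\) used in the proofs below. Everything model-specific here is a hypothetical choice made for illustration: the one-factor dynamics `G`, the log-return function `F`, the finite control set, the truncated state grid, and the Gauss-Hermite discretisation of a standard normal noise.

```python
import numpy as np

GAMMA = -2.0                           # risk-aversion parameter, gamma < 0
CONTROLS = np.linspace(0.0, 1.0, 11)   # finite stand-in for the compact set U
GRID = np.linspace(-5.0, 5.0, 201)     # truncated state grid (k = 1)
NODES, WEIGHTS = np.polynomial.hermite_e.hermegauss(15)
WEIGHTS = WEIGHTS / WEIGHTS.sum()      # probabilities of a discretised N(0, 1) noise

def G(x, w):
    """Hypothetical factor dynamics (autoregressive)."""
    return 0.8 * x + 0.5 * w

def F(x, h, w):
    """Hypothetical one-period log-return for position h."""
    return h * (0.01 + 0.02 * x + 0.1 * w)

def bellman_T(f_vals):
    """Apply T f(x) = min_h ln E[exp(gamma F(x,h,W) + f(G(x,W)))] on GRID."""
    x = GRID[:, None, None]       # shape (states, 1, 1)
    h = CONTROLS[None, :, None]   # shape (1, controls, 1)
    w = NODES[None, None, :]      # shape (1, 1, noise nodes)
    # f at the next state; np.interp clamps values outside the grid.
    next_f = np.interp(G(x, w), GRID, f_vals)
    expo = GAMMA * F(x, h, w) + next_f
    shift = expo.max(axis=2, keepdims=True)   # log-sum-exp stabilisation
    log_mgf = shift[..., 0] + np.log(np.sum(WEIGHTS * np.exp(expo - shift), axis=2))
    return log_mgf.min(axis=1)

u = np.zeros_like(GRID)
Tu = bellman_T(u)
```

In this discretisation the operator commutes with adding constants, `bellman_T(u + c) == bellman_T(u) + c` up to rounding, which is why contraction can only be expected in a span seminorm rather than in a norm.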
Remark 4
Bellman equation (23) is closely connected to the Multiplicative Poisson Equation (MPE) defined for the corresponding \(\gamma \) (cf. Di Masi and Stettner 2006a and references therein). Sufficient general conditions under which a solution to the MPE exists in the classic case (i.e. using ergodicity conditions and the span norm or the vanishing discount approach) can be found e.g. in Di Masi and Stettner (1999), Kontoyiannis and Meyn (2003), Hernández-Lerma (1989), Hernández-Hernández and Marcus (1996). For more general conditions (obtained using Markov splitting techniques or Doeblin’s condition), see e.g. Di Masi and Stettner (2006a) and Cavazos-Cadena and Hernández-Hernández (2005). Also, using the robust representation of the risk measure (i.e. \(-\mu ^{\gamma }\)) (Föllmer and Schied 2002), one can see that equation (21) corresponds to the Isaacs equation for an ergodic cost stochastic dynamic game (cf. Hernández-Hernández and Marcus 1996; Fleming and Hernández-Hernández 1997 and references therein).
Proposition 3
Let \(\gamma <0\). Under assumptions (A.1)–(A.3), the operators \(R_{\gamma }\) and \(T_{\gamma }\) transform the set \(\mathcal {C}_{\omega }(\mathbb {R}^{k})\) into itself, and for \(f\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) the mapping \((-\infty ,0)\times \mathbb {R}^{k} \ni (\gamma ,x) \mapsto T_\gamma f(x)\) is continuous.
Proof
We will only show the proof for \(R_{\gamma }\), as the proof for \(T_{\gamma }\) is analogous. Let \(f\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) and \(\gamma <0\). We know that there exists \(M>1\), such that for all \(x\in \mathbb {R}^{k}\), we get \(|f(x)|\le M(\omega (x)+1)\).
First, let us prove that \(\Vert R_{\gamma }f\Vert _{\omega }\) is finite. Using the fact that \(\mu ^{\gamma }\) is monotone and translation invariant as well as (A.3), for any \(x\in \mathbb {R}^{k}\), we get
as well as
Consequently, noting that \(R_{\gamma }f\in \mathcal {C}_{\omega '}(\mathbb {R}^{k})\) for
and using (20), we conclude that \(\Vert R_{\gamma }f\Vert _{ \omega }\) is finite.
Second, let us prove that the mapping \((-\infty ,0)\times \mathbb {R}^{k} \ni (\gamma ,x) \mapsto R_{\gamma }f(x)\) is continuous. Let \(\{(\gamma _n,x_{n},h_n)\}_{n\in \mathbb {N}}\) be a sequence such that \(\gamma _n<0\), \(x_n\in \mathbb {R}^{k}\), \(h_{n}\in U\) and \((\gamma _n, x_{n},h_n)\rightarrow (\gamma , x,h)\), where \(\gamma <0\), \(x\in \mathbb {R}^{k}\) and \(h\in U\). By (A.1) and (A.2) we know that
As the weight function \(\omega \) is continuous and finite-valued, we know that
Moreover, using (A.3), we get
with \(\gamma _0\) such that for any n we have \(\gamma _n \le \gamma _0\). Noting that
by dominated convergence theorem,
and consequently
Let \(h_{z}^\gamma := {{\mathrm{arg\,max}}}_{h\in U} \mu ^{\gamma }(F(z,h,W_0)+f(G(z,W_0)))\), for any \(z\in \mathbb {R}^{k}\) (note that U is compact). Due to continuity of the function
we also know that
which implies continuity of \((\gamma ,x) \mapsto R_{\gamma }f(x)\). \(\square \)
We are now ready to formulate the main result of this paper.
Theorem 1
Let \(\gamma <0\). Under assumptions (A.1)–(A.4), the operator \(T_{\gamma }\) is a local contraction under \(\Vert \cdot \Vert _{\beta ,\omega {\text {-span}}}\) for sufficiently small \(\beta >0\), i.e. there exist functions \(\beta : \mathbb {R}_{+}\rightarrow (0,1)\) and \(L: \mathbb {R}_{+} \rightarrow (0,1)\) such that, for any \(M>0\),
$$\begin{aligned} \Vert T_{\gamma }f_1-T_{\gamma }f_2\Vert _{\beta (M),\omega {\text {-span}}}\le L(M)\Vert f_1-f_2\Vert _{\beta (M),\omega {\text {-span}}}, \end{aligned}$$
for \(f_1,f_2\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\), such that \(\Vert f_1\Vert _{\omega {\text {-span}}}\le M\) and \(\Vert f_2 \Vert _{\omega {\text {-span}}}\le M\).
The proof of Theorem 1 will be split into three lemmas which we will now formulate and prove. Before we do this, let us introduce some helpful notation.
Let \((\varOmega ,\mathcal {F}_{1},\mathbb {P}_{1})\) be a probability space which corresponds to random variable \(W_0\). For any \(f\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\), \(x\in \mathbb {R}^{k}\) and \(h\in U\) we will use the following notation
where \(\mathcal {M}_{1}:=\mathcal {M}_{1}(\varOmega ,\mathcal {F}_{1})\) denotes the set of all probability measures on \((\varOmega ,\mathcal {F}_{1})\), \(H[\mathbb {Q}\Vert \mathbb {P}_{1}]\) is the relative entropy of \(\mathbb {Q}\) wrt. \(\mathbb {P}_{1}\), i.e.
and the convention \(\infty -\infty =-\infty \) is used. Objects defined in (25) and (26) might be non-unique in the sense that \({{\mathrm{arg\,min}}}\) (or \({{\mathrm{arg\,max}}}\)) might define a set, rather than a single element. Nevertheless, with a slight abuse of notation, we take any fixed maximizer of (25) and assume that \(h_{x,f}\in U\). To have a unique representation of the measure \(\mathbb {Q}_{(x,f,h)}\), we use the so-called Esscher transformation (Gerber 1979). Before we write the explicit form of \(\mathbb {Q}_{(x,f,h)}\), let us give a more specific comment. The measure \(\mathbb {Q}_{(x,f,h)}\) corresponds to the minimizing scenario in the robust (dual) representation of the entropic utility \(\mu ^{\gamma }\). Indeed (see e.g. Dai Pra et al. 1996), for any \(Z\in L^{0}(\varOmega ,\mathcal {F}_{1},\mathbb {P}_{1})\), such that \(\gamma Ze^{\gamma Z}\in L^{1}(\varOmega ,\mathcal {F}_{1},\mathbb {P}_{1})\), we get
To show that
is such that \(\gamma Ze^{\gamma Z}\in L^{1}(\varOmega ,\mathcal {F}_{1},\mathbb {P}_{1})\), it is enough to note that \(\Vert f\Vert _{\omega }<\infty \) and use (A.3). Then, we get
which combined with the fact that for any \(\gamma <0\) we get
concludes the proof. Then, as shown in Dai Pra et al. (1996, Proposition 2.3), we could define the minimizer of (26) through Esscher transformation of Z, i.e. the measure \(\mathbb {Q}_{(x,f,h)}\) given by
We will also define the measure \(\bar{\mathbb {Q}}_{(x,f,h)}\) on \(\mathbb {R}^{k}\), by
Finally, for any \(f,g\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) and \(x,y\in \mathbb {R}^{k}\) we shall write
We are now ready to introduce Lemma 1, Lemma 2 and Lemma 3.
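The role of the Esscher transformation can be illustrated on a finite sample space. The sketch below assumes the standard form of the dual representation for \(\gamma <0\), namely \(\mu ^{\gamma }(Z)=\inf _{\mathbb {Q}}\{\mathbb {E}_{\mathbb {Q}}[Z]-\frac{1}{\gamma }H[\mathbb {Q}\Vert \mathbb {P}_{1}]\}\), with the minimizer having density proportional to \(e^{\gamma Z}\). The four-point distribution, the values of Z and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = -3.0
p = np.array([0.1, 0.4, 0.3, 0.2])      # reference measure P_1 on four outcomes
z = np.array([-0.2, 0.0, 0.05, 0.3])    # values of the random variable Z

def relative_entropy(q, p):
    """H[Q || P] = sum q ln(q / p), with the convention 0 ln 0 = 0."""
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

def dual_objective(q):
    """E_Q[Z] - (1/gamma) H[Q || P]; the penalty is nonnegative since gamma < 0."""
    return float(np.dot(q, z) - relative_entropy(q, p) / gamma)

# Entropic utility of Z and the Esscher transform of P_1 with respect to Z.
mgf = np.dot(p, np.exp(gamma * z))
mu = float(np.log(mgf) / gamma)
esscher = p * np.exp(gamma * z) / mgf

# Randomly drawn alternative measures should never beat the Esscher measure.
others = rng.dirichlet(np.ones(4), size=1000)
gap = min(dual_objective(q) for q in others) - mu
```

The dual objective evaluated at `esscher` coincides with `mu`, while every randomly drawn alternative gives a value at least as large, so `gap` is nonnegative.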
Lemma 1
Let \(\gamma <0\). Under assumptions (A.1)–(A.3), we get
for any \(f,g\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\), \(x,y\in \mathbb {R}^{k}\) and \(\beta >0\).
Proof
Let \(f,g\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\), \(x,y\in \mathbb {R}^{k}\) and let \(\beta >0\). Using (25) we get
Now, using (26) we get
Combining (32) and (33) we get
Switching f with g in (34), and doing similar computations for \(y\in \mathbb {R}^{k}\), we get
Combining (34) with (35) and recalling notation (30), we get
We know that for any \(c\in \mathbb {R}\), we get
Let \(A\subset \mathbb {R}^{k}\) denote a positive set for a signed measure \(\mathbb {H}^{f,g}_{x,y}\) (obtained e.g. using Hahn-Jordan decomposition) and for any \(c\in \mathbb {R}\) let
Then, for any \(c\in \mathbb {R}\), we get
From Proposition 2 we know that there exists \(c_0\in \mathbb {R}\), such that
Thus, from (37) we get
which together with (36) concludes the proof of (31). \(\square \)
Lemma 2
Let \(\gamma <0\). Under assumptions (A.1)–(A.3), for any fixed \(M>0\) and \(\phi \in (b_1,1)\), there exists \(\alpha _{\phi }>0\), such that
for any \(x,y\in \mathbb {R}^{k}\) and \(f,g\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) satisfying \(\Vert f\Vert _{\omega {\text {-span}}}\le M\) and \(\Vert g\Vert _{\omega {\text {-span}}}\le M\).
Proof
For any \(x,y\in \mathbb {R}^{k}\) and \(f,g\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) we get
Thus, to prove (39) it is sufficient to show that for any fixed \(M>0\) and \(\phi \in (b_1,1)\), there exists \(\alpha _{\phi }>0\), such that
for any \(h\in U\), \(x\in \mathbb {R}^{k}\) and \(f\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) satisfying \(\Vert f\Vert _{\omega {\text {-span}}}\le M\).
Let \(M>0\) and \(\phi \in (b_1,1)\). Using (28) and (29) we get that (40) is equivalent to
For simplicity let \(Z:= \gamma F(x,h,W_0)+f(G(x,W_0))\). It is enough to prove that
where \(A=\{\omega (G(x,W_0))-\phi \omega (x)>\frac{\alpha _{\phi }}{2}\}\), as the inequality
is trivial. Using Schwarz inequality we get \(1\le \mathbb {E}[e^{-Z}]\mathbb {E}[e^{Z}]\), so it is enough to show that
Multiplying both sides of (41) by \(\frac{2(Mb_1-\gamma b_2)}{(\phi -b_1)}\), using the fact that \(y<e^{y}\) for any \(y>0\), and inequality \(\frac{2Mb_1}{(\phi -b_1)}<\frac{2(Mb_1-\gamma b_2)}{(\phi -b_1)}\), to prove (40), it is sufficient to show that
Using (A.3) and Schwarz inequality we get
so instead of (42) it is enough to show that
Let us prove (43). Due to (A.3) we know that
On the other hand, from the fact that \(\Vert f\Vert _{\omega {\text {-span}}}\le M\), we know that there exists \(a\in \mathbb {R}\) such that \(\Vert f+a \Vert _{\omega }\le M\). Consequently, recalling that \(Z=\gamma F(x,h,W_0)+f(G(x,W_0))\), using monotonicity of the exponential function and (A.3), we get
Using (45), (46) and (6) we get
Combining (47) and (44), we get that (43) will hold for \(\alpha _{\phi }\) large enough. In other words it is enough to choose \(\alpha _{\phi }\), such that
This concludes the proof of (40). \(\square \)
Lemma 3
Let \(\gamma <0\). Under assumptions (A.1)–(A.4), for any fixed \(M>0\), \(\phi \in (b_1,1)\) and \(\alpha _{\phi }>0\), there exist \(\beta \in (0,1)\) and \(L\in (0,1)\) such that
for any \(x,y\in \mathbb {R}^{k}\) and \(f,g\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) satisfying \(\Vert f\Vert _{\omega {\text {-span}}}\le M\) and \(\Vert g \Vert _{\omega {\text {-span}}}\le M\).
Proof
Let us fix \(M>0\), \(\phi \in (b_1,1)\) and \(\alpha _{\phi }>0\). Let \(R\in \mathbb {R}\) be such that
We will consider two cases:
and find \(\beta <1\) and \(L\in (0,1)\) such that (49) is satisfied both on \(\{\omega (x)+\omega (y)>R\}\) and \(\{\omega (x)+\omega (y)\le R\}\).
- Case a):

Noting that \(\Vert \mathbb {H}^{f,g}_{x,y}\Vert _{\text {var}}\le 2\), it is enough to find \(\beta <1\) and \(L\in (0,1)\) such that

$$\begin{aligned} 2+\beta (\phi \omega (x)+\phi \omega (y)+2\alpha _{\phi })\le L(2+\beta \omega (x)+\beta \omega (y)), \end{aligned}$$ (51)

for any \(x,y\in \mathbb {R}^{k}\), such that \(\omega (x)+\omega (y)>R\). We will show that in this case for any \(\beta <1\) we could find \(L\in (0,1)\) such that (51) holds. Let \(\beta <1\). We know that (51) is equivalent to

$$\begin{aligned} 2+2\beta \alpha _{\phi } \le 2L+\beta (L-\phi )(\omega (x)+\omega (y)). \end{aligned}$$

Let us assume that \(L>\phi \). Then, it is sufficient to show that

$$\begin{aligned} 2+2\beta \alpha _{\phi } \le 2L+\beta (L-\phi )R, \end{aligned}$$

which is equivalent to

$$\begin{aligned} \frac{2+\beta (2\alpha _{\phi }+\phi R)}{2+\beta R}\le L. \end{aligned}$$ (52)

Consequently, using (50), it is enough to choose any \(L<1\) such that

$$\begin{aligned} L\in \left( \max \left\{ \phi ,\frac{2+\beta (2\alpha _{\phi }+\phi R)}{2+\beta R}\right\} ,1\right) . \end{aligned}$$ (53)

- Case b):

Let \(C_{R}:=\{(x,y)\in \mathbb {R}^{k}\times \mathbb {R}^{k}: \omega (x)+\omega (y)\le R\}\). It is sufficient to show that there exist \(\beta \in (0,1)\) and \(L\in (0,1)\) such that for any \((x,y)\in C_{R}\) and \(f,g\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) satisfying \(\Vert f\Vert _{\omega {\text {-span}}}\le M\) and \(\Vert g \Vert _{\omega {\text {-span}}}\le M\), we get

$$\begin{aligned} \Vert \mathbb {H}^{f,g}_{x,y} \Vert _{{\text {var}}}+\beta (\phi R+2\alpha _{\phi }) <2L. \end{aligned}$$

In fact, it is enough to show that

$$\begin{aligned} \sup _{(x,y)\in C_{R}}\Vert \mathbb {H}^{f,g}_{x,y} \Vert _{{\text {var}}}<2. \end{aligned}$$ (54)

Indeed, then it is enough to choose any \(\beta <1\) such that

$$\begin{aligned} \beta < \frac{2-\sup _{(x,y)\in C_{R}}\Vert \mathbb {H}^{f,g}_{x,y} \Vert _{{\text {var}}}}{\phi R+2\alpha _{\phi }}, \end{aligned}$$

and consider any

$$\begin{aligned} L\in \left( \frac{\sup _{(x,y)\in C_{R}}\Vert \mathbb {H}^{f,g}_{x,y} \Vert _{{\text {var}}}+\beta (\phi R+2\alpha _{\phi })}{2},1\right) . \end{aligned}$$ (55)

Suppose, to the contrary, that (54) is false. Then, there exists a sequence

$$\begin{aligned} (x_n,y_n,f_n,g_n,A_n)_{n\in \mathbb {N}}, \end{aligned}$$

for \((x_n,y_n)\in C_{R}\), \(f_n,g_n\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) and \(A_n\in \mathcal {B}(\mathbb {R}^{k})\), such that \(\Vert f_n\Vert _{\omega {\text {-span}}}\le M\), \(\Vert g_n \Vert _{\omega {\text {-span}}}\le M\) and

$$\begin{aligned} \mathbb {H}^{f_n,g_n}_{x_n,y_n}(A_n)=\bar{\mathbb {Q}}_{(x_n,g_n,h_{(x_n,f_n)})}(A_n)-\bar{\mathbb {Q}}_{(y_n,f_n,h_{(y_n,g_n)})}(A_n)\rightarrow 1. \end{aligned}$$ (56)

Due to (56) we know that

$$\begin{aligned} \bar{\mathbb {Q}}_{(x_n,g_n,h_{(x_n,f_n)})}(A^c_n)\rightarrow 0\quad {\text {and}}\quad \bar{\mathbb {Q}}_{(y_n,f_n,h_{(y_n,g_n)})}(A_n)\rightarrow 0. \end{aligned}$$ (57)

Next, for any \(x\in \mathbb {R}^{k}\), \(h\in U\), \(f\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) and \(A\in \mathcal {B}(\mathbb {R}^{k})\), such that \(\omega (x)\le R\) and \(\Vert f\Vert _{\omega {\text {-span}}}\le M\), using Schwarz inequality we get

$$\begin{aligned}&\bar{\mathbb {Q}}_{(x,f,h)}(A) = \frac{\mathbb {E}\left[ \mathbf {1}_{\{G(x,W_0)\in A\}}e^{\gamma [F(x,h,W_0)+\frac{1}{|\gamma |}f(G(x,W_0))]}\right] }{\mathbb {E}\left[ e^{\gamma [F(x,h,W_0)+\frac{1}{|\gamma |}f(G(x,W_0))]}\right] }\nonumber \\&\qquad =\frac{\mathbb {E}\left[ \mathbf {1}_{\{G(x,W_0)\in A\}}e^{\gamma [F(x,h,W_0) +\frac{1}{|\gamma |}f(G(x,W_0))]}\right] }{\mathbb {E}\left[ e^{\gamma [F(x,h,W_0) +\frac{1}{|\gamma |}f(G(x,W_0))]}\right] }\frac{\mathbb {E}\left[ e^{-\gamma [F(x,h,W_0)+\frac{1}{|\gamma |}f(G(x,W_0))]}\right] }{\mathbb {E}\left[ e^{-\gamma [F(x,h,W_0)+\frac{1}{|\gamma |}f(G(x,W_0))]}\right] }\nonumber \\&\qquad \ge \frac{\mathbb {E}\left[ \mathbf {1}_{\{G(x,W_0)\in A\}}e^{\frac{\gamma }{2}[F(x,h,W_0) +\frac{1}{|\gamma |}f(G(x,W_0))]}e^{-\frac{\gamma }{2}[F(x,h,W_0) +\frac{1}{|\gamma |}f(G(x,W_0))]}\right] ^{2}}{\mathbb {E}\left[ e^{\gamma [F(x,h,W_0) +\frac{1}{|\gamma |}f(G(x,W_0))]}\right] \mathbb {E}\left[ e^{-\gamma [F(x,h,W_0)+\frac{1}{|\gamma |}f(G(x,W_0))]}\right] }\nonumber \\&\qquad \ge \frac{\mathbb {E}\big [\mathbf {1}_{\{G(x,W_0)\in A\}}\big ]^{2}}{e^{2[(Mb_1-\gamma b_2)\omega (x)+M]}\mathbb {E}[e^{Ma_1(W_0)-\gamma a_2(W_0)}]^{2}}\nonumber \\&\qquad \ge \frac{\mathbb {E}\big [\mathbf {1}_{\{G(x,W_0)\in A\}}\big ]^{2}}{e^{2[(Mb_1-\gamma b_2)R+M]}\mathbb {E}[e^{Ma_1(W_0)-\gamma a_2(W_0)}]^{2}}. \end{aligned}$$ (58)

Combining (57) and (58), we get that

$$\begin{aligned} \mathbb {E}\big [\mathbf {1}_{\{G(x_n,W_0)\in A^{c}_n\}}\big ]\rightarrow 0\quad {\text {and}}\quad \mathbb {E}\big [\mathbf {1}_{\{G(y_n,W_0)\in A_n\}}\big ]\rightarrow 0. \end{aligned}$$

On the other hand, from (A.4), for any \(n\in \mathbb {N}\) and \((x_{n},y_{n})\in C_{R}\), we get

$$\begin{aligned} \mathbb {E}\big [\mathbf {1}_{\{G(x_n,W_0)\in A^{c}_n\}}\big ]+\mathbb {E}\big [\mathbf {1}_{\{G(y_n,W_0)\in A_n\}}\big ]\ge c\nu (A_{n}^{c})+c\nu (A_{n})=c>0, \end{aligned}$$

where c and \(\nu \) satisfy (8), for \(C_{R}\). This leads to a contradiction and, in consequence, concludes the proof of Case b).
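The key step in (58) is the Schwarz-type bound \(\bar{\mathbb {Q}}(A)\ge \mathbb {P}(A)^{2}/(\mathbb {E}[e^{Z}]\mathbb {E}[e^{-Z}])\) for an exponentially tilted measure. The following Python sketch checks this bound on a simulated sample. It is only an illustration: the noise sample, the exponent `z` and the event `in_a` are hypothetical choices and are not taken from the model of this paper.

```python
import math
import random


def tilted_probability(z, in_a):
    """Empirical version of Q(A) = E[1_A e^Z] / E[e^Z]."""
    numerator = sum(math.exp(zi) for zi, a in zip(z, in_a) if a)
    denominator = sum(math.exp(zi) for zi in z)
    return numerator / denominator


def schwarz_lower_bound(z, in_a):
    """Empirical version of P(A)^2 / (E[e^Z] E[e^{-Z}])."""
    n = len(z)
    p_a = sum(in_a) / n
    mean_pos = sum(math.exp(zi) for zi in z) / n
    mean_neg = sum(math.exp(-zi) for zi in z) / n
    return p_a ** 2 / (mean_pos * mean_neg)


rng = random.Random(0)
w = [rng.gauss(0.0, 1.0) for _ in range(10_000)]  # hypothetical noise sample
z = [-0.5 * wi + 0.3 * abs(wi) for wi in w]       # hypothetical exponent Z
in_a = [wi > 0.5 for wi in w]                     # hypothetical event A

q_a = tilted_probability(z, in_a)
bound = schwarz_lower_bound(z, in_a)
print(f"tilted probability: {q_a:.4f}, lower bound: {bound:.4f}")
```

Since the Schwarz inequality also holds for the empirical measure of the sample, the computed bound never exceeds the tilted probability, whatever sample is drawn.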
We are now ready to prove (49). Indeed, combining (53) and (55) we conclude that for a given \(M>0\), \(\phi \in (b_1,1)\), \(\alpha _{\phi }>0\) and \(R\in \mathbb {R}\) satisfying (50), it is enough to choose \(\beta <1\) and \(L\in (0,1)\), such that
This concludes the proof of (49). \(\square \)
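The choice of L in Case a) can be checked numerically. The Python sketch below picks L as the midpoint of the interval (53) and verifies (51) on a grid of values of \(\omega (x)+\omega (y)\) exceeding R. The condition \(R>2\alpha _{\phi }/(1-\phi )\) used in the code is derived directly from (52), as it is exactly what makes the lower bound in (53) smaller than 1; we only assume that this is the role played by (50). All numerical values are hypothetical.

```python
def contraction_constant(phi, alpha, beta, radius):
    """Return the midpoint of the interval (53) of admissible constants L."""
    # The interval (53) is non-empty only if its lower end is below 1,
    # which for beta > 0 is equivalent to radius > 2 * alpha / (1 - phi).
    if not radius > 2.0 * alpha / (1.0 - phi):
        raise ValueError("radius too small: the interval (53) would be empty")
    lower = max(
        phi,
        (2.0 + beta * (2.0 * alpha + phi * radius)) / (2.0 + beta * radius),
    )
    return 0.5 * (lower + 1.0)


def inequality_51_holds(s, phi, alpha, beta, big_l):
    """Check inequality (51), where s stands for omega(x) + omega(y)."""
    return 2.0 + beta * (phi * s + 2.0 * alpha) <= big_l * (2.0 + beta * s)


# Hypothetical parameters, chosen for illustration only.
phi, alpha, beta, radius = 0.6, 1.0, 0.5, 10.0
big_l = contraction_constant(phi, alpha, beta, radius)
checks = [
    inequality_51_holds(radius + 0.1 * k, phi, alpha, beta, big_l)
    for k in range(1, 2001)
]
print(f"L = {big_l:.4f}, (51) holds on the grid: {all(checks)}")
```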
We are now ready to prove Theorem 1.
Proof
(Proof of Theorem 1) Let \(\gamma <0\). Combining Lemmas 1, 2 and 3 we know that for any fixed M, there exists \(\beta (M)\in (0,1)\) and \(L(M)\in (0,1)\), such that
for any \(f,g\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) and \(x,y\in \mathbb {R}^{k}\) satisfying \(\Vert f\Vert _{\omega {\text {-span}}}\le M\) and \(\Vert g \Vert _{\omega {\text {-span}}}\le M\). Consequently, for any fixed M, there exists \(\beta (M)\in (0,1)\) and \(L(M)\in (0,1)\), such that
whenever \(\Vert f\Vert _{\omega {\text {-span}}}\le M\) and \(\Vert g \Vert _{\omega {\text {-span}}}\le M\). This concludes the proof of Theorem 1. \(\square \)
Corollary 1
For a given \(\gamma _0<0\) there exists \(\beta : \mathbb {R}_{+}\rightarrow (0,1)\) and \(L: \mathbb {R}_{+} \rightarrow (0,1)\), such that for any \(\gamma \in [\gamma _0,0)\), operator \(T_{\gamma }\) is a local contraction wrt. \(\beta \) and L, i.e. for any \(\gamma \in [\gamma _0,0)\), we get
for \(f_1,f_2\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\), such that \(\Vert f_1\Vert _{\omega {\text {-span}}}\le M\) and \(\Vert f_2 \Vert _{\omega {\text {-span}}}\le M\).
Proof
The proof of Corollary 1 is a direct consequence of the proof of Theorem 1. For transparency, let us briefly explain the idea of the proof.
For clarity let us fix \(M>0\) and consider \(L(M)\in (0,1)\) and \(\beta (M)\in (0,1)\). Let \(\alpha _{\phi }>0\) be such that (48) is satisfied for \(\gamma _0\), i.e.
and let R be such that (50) is satisfied for \(\gamma _0\). Then, for any \(\gamma \in [\gamma _0,0)\) we get
Consequently, the choice of \(\alpha _{\phi }\) and R will guarantee (48) and (50), for any \(\gamma \in [\gamma _0,0)\).
Next, we know that \(\beta (M)\) and L(M) are chosen in such a way that (59) is satisfied for \(\gamma _0\), i.e.
Thus, it is sufficient to show that we could find a constant \(a\in (0,2)\) such that
for any \(\gamma \in [\gamma _0,0)\). To do that it is enough to notice that the lower bound for \(\bar{\mathbb {Q}}_{(x,f,h)}\) introduced in (58) is in fact decreasing wrt. \(\gamma \). \(\square \)
Using Theorem 1, i.e. the contraction property of the operator \(T_{\gamma }\), one can solve the Bellman equations (23) and (21).
Proposition 4
Under assumptions (A.1)–(A.4), there exists \(\gamma _0<0\), such that for any \(\gamma \in (\gamma _0,0)\), there exist a unique (up to an additive constant) \(u_{\gamma }\in \mathcal {C}_{\omega }(\mathbb {R}^k)\) and \(\lambda _{\gamma }\in \mathbb {R}\), the solutions to Bellman equation (23).
Proof
Let us fix \(\bar{\gamma }<0\) and let \(M:=\mu ^{0}(a_{2}(W_0))-\mu ^{\bar{\gamma }}(-a_{2}(W_0))+b_2\). We know that for any \(\gamma \in [\bar{\gamma },0)\) we get \(\Vert R_{\gamma }0\Vert _{\omega {\text {-span}}}\le M\), as
For the operator \(T_{\bar{\gamma }}\) and M, let \(\beta (M)\) and L(M) denote corresponding constants from Theorem 1. For simplicity we will write \(\beta \) and L, instead of \(\beta (M)\) and L(M). Let
Noting that \(\gamma _0\in (-1,0)\) and using Corollary 1, for any \(\gamma \in (\gamma _0,0)\), we know that
for \(f_1,f_2\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\), such that \(\Vert f_1\Vert _{\omega {\text {-span}}}\le M\) and \(\Vert f_2 \Vert _{\omega {\text {-span}}}\le M\).
As \(|\gamma |<\beta (1-L)\), it can be easily shown that for any \(n\in \mathbb {N}\) we get \(\Vert T^{n}_{\gamma }0\Vert _{\omega {\text {-span}}}\le M\). Indeed, using (61), we get
Using Banach’s fixed point theorem (see e.g. Hernández-Lerma 1989, Appendix A), we know that there exists at most one fixed point of \(T_{\gamma }\) in \(\mathcal {C}_{\omega }(\mathbb {R}^{k})\) endowed with the \(\omega \)-span norm. Exploiting the fact that \(\Vert T^{n}_{\gamma }0\Vert _{\omega {\text {-span}}}\le M\) for any \(n\in \mathbb {N}\) and the local contraction property of \(T_{\gamma }\) we conclude that there exists a unique \(u_{\gamma }\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) (up to an additive constant), such that
Consequently, for a fixed \(a\in \mathbb {R}^{k}\), the constant \(\lambda _{\gamma }:=\frac{T_{\gamma }u_{\gamma }(a)-u_{\gamma }(a)}{\gamma }\) and \(u_{\gamma }\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) are solutions to Bellman equation (23).
Thus, the constant \(\lambda _{\gamma }:=R_{\gamma }v_{\gamma }(0)-v_{\gamma }(0)\) and \(v_{\gamma }\in \mathcal {C}_{\omega }(\mathbb {R}^{k})\) are solutions to Bellman equation (21). \(\square \)
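The fixed point argument above can be illustrated on a toy model with a finite factor space. The Python sketch below is not the setting of this paper and rests on several assumptions: the operator is taken to be \(Tf(x)=\min _{h}\ln \mathbb {E}[e^{\gamma F(x,h,W)+f(X')}]\), the factor moves according to a strictly positive transition matrix `trans` independently of the asset noise, there is one risky asset with two equally likely returns per factor state, and the portfolio weight h ranges over a finite grid. All names and numbers are hypothetical.

```python
import math


def bellman_operator(f, gamma, trans, returns, rate, grid):
    """Toy operator T f(x) = min_h log E[exp(gamma * F(x, h, W) + f(X'))]."""
    result = []
    for x, row in enumerate(trans):
        # Factor part: log E[exp(f(X'))] given the current state x.
        continuation = math.log(sum(p * math.exp(fj) for p, fj in zip(row, f)))
        # Asset part: exp(gamma * F) equals the gross portfolio return to the
        # power gamma; the two return outcomes are equally likely.
        best = min(
            math.log(sum(0.5 * (1.0 + rate + h * (r - rate)) ** gamma
                         for r in returns[x]))
            for h in grid
        )
        result.append(best + continuation)
    return result


def span(values):
    """Span seminorm of a function on a finite state space."""
    return max(values) - min(values)


def solve(gamma, trans, returns, rate, grid, tol=1e-10, max_iter=10_000):
    """Iterate T until T u - u is constant up to tol in the span seminorm."""
    u = [0.0] * len(trans)
    for _ in range(max_iter):
        tu = bellman_operator(u, gamma, trans, returns, rate, grid)
        residual = span([a - b for a, b in zip(tu, u)])
        lam = (tu[0] - u[0]) / gamma  # candidate lambda, read off at state 0
        if residual < tol:
            return u, lam, residual
        # The solution is unique only up to a constant, so normalise u(0) = 0.
        u = [a - tu[0] for a in tu]
    raise RuntimeError("iteration did not converge")


trans = [[0.6, 0.3, 0.1], [0.2, 0.6, 0.2], [0.1, 0.3, 0.6]]
returns = [[0.10, -0.05], [0.06, -0.04], [0.03, -0.06]]
rate = 0.01
grid = [k / 20 for k in range(21)]

u, lam, residual = solve(-0.5, trans, returns, rate, grid)
print(f"lambda = {lam:.6f}, span residual = {residual:.2e}")
```

Since h = 0 belongs to the grid, the computed constant cannot fall below the bank account growth rate \(\ln (1+r)\), which gives a simple sanity check of the output.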
At the end of this section, let us show a corollary, which will be helpful later. To do so let us fix \(a\in \mathbb {R}^{k}\) and define \(\bar{u}_\gamma (x):=u_\gamma (x)-u_\gamma (a)\) for \(x\in \mathbb {R}^{k}\).
Corollary 2
Under the assumptions and notation of Proposition 4 the functions \((\gamma _0,0)\ni \gamma \mapsto \lambda _{\gamma }\) and \((\gamma _0,0)\ni \gamma \mapsto \bar{u}_{\gamma }(x)\) for each \(x\in \mathbb {R}^{k}\) are continuous.
Proof
Clearly when \(u_\gamma \) is a solution to (23) then \(\bar{u}_\gamma \) is also a solution to (23). By (61) and the proof of Proposition 4 we have that \(\Vert \bar{u}_\gamma \Vert _{\omega {\text {-span}}}\le M\) and
for any \(x\in \mathbb {R}^{k}\) and \(\gamma \) from a compact subinterval of \((\gamma _0,0)\). By Proposition 3 for each m and fixed \(x \in \mathbb {R}^{k}\) the mappings \(\gamma \rightarrow T_{\gamma }^m0(x)\) and \(\gamma \rightarrow T_{\gamma }^m0(a)\) are continuous. Therefore when \(\gamma _n \rightarrow \gamma <0\) we have, using (62), that
For a given \(\epsilon \) we can choose m such that \(c_m\le \epsilon \). Then letting \(n\rightarrow \infty \) for fixed m we obtain continuity of the mapping \(\gamma \rightarrow \bar{u}_\gamma (x)\). Following the proof of Proposition 3 we can also show that the mapping \(\gamma \rightarrow T_{\gamma }\bar{u}_\gamma (x)\) is continuous. Consequently, the mapping \(\gamma \rightarrow \lambda _\gamma ={T_\gamma \bar{u}_\gamma (x)-\bar{u}_\gamma (x)\over \gamma }\) is continuous, which completes the proof. \(\square \)
5 Optimal strategy
It is straightforward to check that, under the assumptions and notation of Proposition 4, \(v_{\gamma }(x)=\frac{u_{\gamma }(x)}{\gamma }\) and \(\lambda _{\gamma }\) are solutions to Bellman equation (21). Finally, we can link the Bellman equations (21) and (23) to our initial problem (10).
Proposition 5
Under (A.1)–(A.4), there exists \(\gamma _0<0\), such that for any \(\gamma \in (\gamma _0,0)\), we get
i.e. the optimal value in problem (10) does not exceed the solution of Bellman equation (21). Moreover, if \(a_1\) in the assumption (A.3) is bounded from above, we have that the optimal value in (10) is equal to \(\lambda _{\gamma }\) and the optimal strategy is defined by selectors to the Bellman equation (21).
Proof
This proof could be considered as a variation of the classical verification theorem from the theory of risk sensitive control (see e.g. Hernández-Hernández and Marcus 1996, Theorem 2.1). Let \(\gamma _0\) be given by (60) and for \(\gamma \in (\gamma _0,0)\), let \(u_{\gamma }\) and \(\lambda _{\gamma }\) denote the solutions of Bellman equation (23).
First, we need to show that \(\lambda _{\gamma }\) is an upper bound for any \(\gamma \in (\gamma _0,0)\), i.e. that for any adapted strategy \(H=(H_t)_{t\in \mathbb {T}}\), we get
For \(i\in \mathbb {T}\) and \(p>1\), such that \(\gamma > p\gamma _0\), using (23), we have
Consequently, using the tower property, we get
for any \(t\in \mathbb {T}\). Equivalently, for \(v_{\gamma }(x)=\frac{u_{\gamma }(x)}{\gamma }\), we get
It is hard to eliminate v from the above inequality by passing to the limit directly (note that for bounded v this is straightforward). Using Hölder’s inequality we know that for \(q=p/(p-1)\) we get
and consequently (for any \(p>1\)), since \(v_{\gamma \over p}(X_{t})-v_{\gamma \over p}(X_0)\le M (2+\omega (X_{t})+\omega (X_0))\) and \(\lim _{t \rightarrow \infty } \frac{1}{t}\mu ^{q\gamma }(\omega (X_t))=0\) we have
By continuity of \(\gamma \rightarrow \lambda _\gamma \) (see Corollary 2), we have that \(\lim _{p\rightarrow 1} \lambda _{\gamma \over p}=\lambda _\gamma \), which shows (64).
Second, we show the optimality of the strategy defined by the Bellman equation (21), when \(a_1\) in (A.3) is bounded from above by \(\tilde{a}\). Let us fix \(\gamma \in (\gamma _0,0)\) and let \(M>0\) be such that \(\Vert v_\gamma \Vert _{\omega }\le M\). For the strategy \(\hat{H}\) determined by the Bellman equation (23), using monotonicity of \(\mu _{\gamma }\), we get
Letting \(t\rightarrow \infty \) we obtain [taking into account (64)]
which completes the second part of the proof. \(\square \)
6 Exemplary dynamics
In this section let us present examples of dynamics for which assumptions (A.1)–(A.4) are fulfilled.
Example 1
In this example, we shall set \(\omega \equiv 0\) (equivalently, one might say that \(\omega \) is bounded) and show that our framework covers a wide class of dynamics in the classical case. The first example is taken from Stettner (1999). We will assume that time \(\mathbb {T}=\mathbb {R}_{+}\) is continuous, but we can only reshape our portfolio in discrete time moments \(n\in \mathbb {N}\). With slight abuse of notation, for \(n\in \mathbb {N}\) and (\(z=1,\ldots ,k+m\)), let us assume that \(W_n=(W_{n}^{1},\ldots ,W_{n}^{k+m})\) and \(W^{z}_{n}\) denotes the trajectory of \(w_{z}(t)-w_{z}(n)\) (\(n\le t\le n+1\)), where \(\{w_{z}(t)\}_{z=1}^{k+m}\) are independent Brownian motions (which generate the filtration). Let us assume that the dynamics of the risky assets and factors is given by
where for (\(i=1,\ldots ,m\)), (\(j=1,\ldots , k\)) and (\(z=1,\ldots , k+m\)): \(a_i,b_i:\mathbb {R}^{k}\rightarrow \mathbb {R}\) are measurable and bounded functions, \(b_i\) is continuous, \(\delta _{jz}\in \mathbb {R}\), \(\sigma _{iz}\in \mathbb {R}\) and \({\text {rank}}((\sigma _{iz})_{z=1,\ldots ,k+m})=k\). Let \(h_{i}(t)\) denote the part of the capital invested at time t in the i-th risky asset and let
Moreover, let \(H^{i}_{n}=h_{i}(n)\). Using Itô’s lemma (see Stettner 1999 for details) we obtain the function F of the form
One can check that assumptions (A.1)–(A.3) will hold in this framework, for \(\omega \equiv 0\). See Stettner (1999), where in fact equivalents of all Propositions from Sect. 4 are directly proved. For clarity, let us show the existence of the upper bound in (A.3), for function F. We get
where \(\Vert a\Vert _{\sup }=\sup _{1\le i\le m}\sup _{x\in \mathbb {R}^{k}}|a_{i}(x)|\) and \(\Vert \sigma \Vert _{\sup }=\sup _{1\le i\le m}\sup _{1\le z\le k+m}|\sigma _{iz}|\).
Thus, it is sufficient to set any \(b_{2}\ge 0\) and
Note, it is easy to check that \(a_2\) will satisfy (6), as for a Gaussian X, we get \(e^{|X|}\in L^1\). Moreover (4) follows from boundedness of b while (8) from nondegeneracy of \(\sigma \) and boundedness of b and in fact one can find a constant c uniform for all \(x\in \mathbb {R}^{k}\). In this example a solution to the Bellman equation (21) is bounded and therefore we obtain in Proposition 5 that \(\lambda _\gamma \) is the optimal value without additional assumptions.
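The integrability claim \(e^{|X|}\in L^1\) for a Gaussian X can be made explicit. For a centred \(X\sim N(0,\sigma ^{2})\), completing the square gives \(\mathbb {E}[e^{|X|}]=2e^{\sigma ^{2}/2}\Phi (\sigma )\), where \(\Phi \) denotes the standard normal distribution function. The Python sketch below compares this closed form with a simple trapezoidal quadrature. The centring of X and the value of \(\sigma \) are assumptions made for illustration only.

```python
import math


def closed_form(sigma):
    """E[exp(|X|)] for X ~ N(0, sigma^2), namely 2 exp(sigma^2 / 2) Phi(sigma)."""
    phi_cdf = 0.5 * (1.0 + math.erf(sigma / math.sqrt(2.0)))
    return 2.0 * math.exp(0.5 * sigma ** 2) * phi_cdf


def quadrature(sigma, steps=20_000):
    """Trapezoidal approximation of 2 * integral over [0, upper] of exp(x) * density(x)."""
    upper = sigma ** 2 + 12.0 * sigma  # the integrand is negligible beyond this point
    width = upper / steps
    norm = 2.0 / (sigma * math.sqrt(2.0 * math.pi))

    def integrand(x):
        return norm * math.exp(x - x * x / (2.0 * sigma ** 2))

    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(k * width) for k in range(1, steps))
    return total * width


sigma = 1.0  # hypothetical standard deviation
exact = closed_form(sigma)
approx = quadrature(sigma)
print(f"closed form: {exact:.6f}, quadrature: {approx:.6f}")
```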
Example 2
We shall now generalize the previous example. Namely, let
where \(B:\mathbb {R}^{k} \rightarrow \mathbb {R}^{k}\) is such that \(\Vert B(x)\Vert \le A+b_1\Vert x\Vert \) with \(b_1<1\) and \(C: \mathbb {R}^{k+m}\rightarrow \mathbb {R}^{k}\) is bounded from above of the form
with \(K>0\). Then
where we assume that \(\Vert a_i\Vert _{\omega }<\infty \). Choosing \(\omega (x)=a+b_1\Vert x\Vert \) one can check that all assumptions (A.1)–(A.4) together with boundedness from above of \(a_1\) in (A.3) are satisfied. In particular, assumption (A.4) is satisfied uniformly in \(x\in \mathbb {R}^k\) from compact sets due to the form of G(x, W) and \(C(W_n)\).
Example 3
Let us assume that assumption (A.1) holds and the dynamics of the i-th risky asset is given by
for any \(t\in \mathbb {T}\), where \(\xi _{i}\) is a measurable vector function. Moreover the set U will be of the form \(\{(h_1,\ldots ,h_m)\in [0,1]^{m}:\ \sum _{i=1}^{m}h_i\le 1\}\). Then we can define F explicitly, as
To get assumptions (A.2) and (A.3) we need to impose additional assumptions on W and \(\xi _{i}\). In particular we can consider the discretized version of Example 1 by setting \(W_n=(W^1_n,\ldots , W^{k+m}_n)\), where \(W_n^{i}=w_{i}(n+1)-w_{i}(n)\) and
See Stettner (2004) for details in general case and Di Masi and Stettner (2006b) for the case when (65) holds.
References
Bielecki TR, Pliska SR (1999) Risk-sensitive dynamic asset management. Appl Math Optim 39(3):337–360
Bielecki TR, Pliska SR (2003) Economic properties of the risk sensitive criterion for portfolio management. Rev Account Finance 2:3–17
Bielecki TR, Cialenco I, Zhang Z (2014) Dynamic coherent acceptability indices and their applications to finance. Math Finance 24(3):411–441
Bielecki TR, Cialenco I, Pitera M (2015) Dynamic limit growth indices in discrete time. Stoch Models 31(3):494–523
Cavazos-Cadena R, Hernández-Hernández D (2005) A characterization of the optimal risk-sensitive average cost in finite controlled Markov chains. Ann Appl Probab 15(1A):175–212
Cherny AS, Madan DB (2009) New measures for performance evaluation. Rev Financ Stud 22(7):2571–2606
Dai Pra P, Meneghini L, Runggaldier WJ (1996) Connections between stochastic control and dynamic games. Math Control Signals Syst 9(4):303–326
Di Masi GB, Stettner Ł (1999) Risk-sensitive control of discrete-time Markov processes with infinite horizon. SIAM J Control Optim 38(1):61–78
Di Masi GB, Stettner Ł (2006a) On additive and multiplicative (controlled) Poisson equations, vol 72. Banach Center Publications/Polish Academy of Sciences, Warsaw, pp 57–70
Di Masi GB, Stettner Ł (2006b) Remarks on risk neutral and risk sensitive portfolio optimization. In: Kabanov Yu, Liptser R, Stoyanov J (eds) From stochastic calculus to mathematical finance. Springer, Berlin, pp 211–226
Fleming WH, Hernández-Hernández D (1997) Risk-sensitive control of finite state machines on an infinite horizon I. SIAM J Control Optim 35(5):1790–1810
Föllmer H, Schied A (2002) Stochastic finance: an introduction in discrete time. In: De Gruyter Studies in Mathematics, vol 27. Walter de Gruyter & Co., Berlin
Gerber HU (1979) An introduction to mathematical risk theory, vol 8. S.S. Huebner Foundation for Insurance Education, Wharton School, University of Pennsylvania, Philadelphia
Gülten S, Ruszczyński A (2015) Two-stage portfolio optimization with higher-order conditional measures of risk. Ann Oper Res 229(1):409–427
Hairer M, Mattingly JC (2011) Yet another look at Harris’ ergodic theorem for Markov chains. In: Dalang RC, Dozzi M, Russo F (eds) Seminar on stochastic analysis, random fields and applications VI. Progress in Probability, vol 63. Birkhäuser/Springer, Basel, pp 109–117
Hernández-Hernández D, Marcus SI (1996) Risk sensitive control of Markov processes in countable state space. Syst Control Lett 29(3):147–155
Hernández-Lerma O (1989) Adaptive Markov control processes. Springer, Berlin
Hernández-Lerma O, Lasserre JB (1996) Discrete-time Markov control processes. Springer, Berlin
Kontoyiannis I, Meyn SP (2003) Spectral theory and limit theorems for geometrically ergodic Markov processes. Ann Appl Probab 13(1):304–362
Kupper M, Schachermayer W (2009) Representation results for law invariant time consistent functions. Math Financ Econ 2(3):189–210
Merton RC (1973) An intertemporal capital asset pricing model. Econometrica 41:867–887
Nagai H (2003) Optimal strategies for risk-sensitive portfolio optimization problems for general factor models. SIAM J Control Optim 41(6):1779–1800
Prigent JL (2007) Portfolio optimization and performance analysis. CRC Press, Boca Raton
Shen Y, Stannat W, Obermayer K (2013) Risk-sensitive Markov control processes. SIAM J Control Optim 51(5):3652–3672
Stettner Ł (1999) Risk sensitive portfolio optimization. Math Methods Oper Res 50(3):463–474
Stettner Ł (2004) Duality and risk sensitive portfolio optimization. Contemp Math 351:333–348
Additional information
Research supported by NCN Grant DEC-2012/07/B/ST1/03298.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Pitera, M., Stettner, Ł. Long run risk sensitive portfolio with general factors. Math Meth Oper Res 83, 265–293 (2016). https://doi.org/10.1007/s00186-015-0528-7