Portfolio Research Based on Mean-Realized Variance-CVaR and Random Matrix Theory under High-Frequency Data
1. Introduction
The mean-variance (MV) model proposed by Markowitz (1952) opened a new chapter in modern portfolio theory, and many scholars have since worked to extend and deepen it. Kolm et al. (2014) summarized its development, challenges and future directions. In Markowitz’s MV model, the mean and the variance measure the average return and the risk of an asset portfolio, respectively. How the variance is calculated depends on the characteristics of the data. According to the sampling frequency, data can be divided into low-frequency data and high-frequency data. High-frequency trading data, sampled over short time spans, reduce the loss of information about the financial market, and they have become easier to obtain with the rapid development of technology. Research on investment strategies based on high-frequency data is therefore both necessary and significant. The realized variance can be calculated from the realized volatility proposed by Andersen & Bollerslev (1998). The literature on realized volatility (variance) is rich, and most of it is devoted to its modification, extension and application to high-frequency financial data. Scholars have also introduced the realized variance (covariance) into asset allocation research, e.g., Pooter et al. (2008), Yao (2010), Song & Hu (2017) and Yin (2016).
In the portfolio studies with realized variance cited above, only one risk measure (variance) is considered. Since different risk measures describe different risk characteristics of assets, scholars have combined multiple measures to construct multi-objective portfolio optimization models. Early examples are the mean-absolute deviation-skewness model and the mean-variance-skewness model in Konno et al. (1993, 1995). Owing to the favorable properties of CVaR, Roman et al. (2007) constructed a mean-variance-CVaR model, which can yield a more balanced portfolio. Li et al. (2012) and Yu & Ma (2014) then used this model to study China’s foreign exchange reserves and sovereign fund investment, respectively. Gao et al. (2016) extended it to a dynamic setting, and Shi et al. (2019) considered the optimal investment and reinsurance problem in continuous time. However, these studies use low-frequency data, and the high-frequency case remains to be explored.
In addition, as financial markets have grown more complex and diverse, Laloux et al. (1999) and Plerou et al. (1999) first applied random matrix theory (RMT) to the stock market and demonstrated that the asset correlation matrix contains “noise” that affects portfolio strategy. RMT has since been used in financial risk management to improve the quality of market information, for example in Han et al. (2014), Xie et al. (2018), Bun et al. (2017) and Shen et al. (2019). Li & Hong (2019) used random matrix theory to study the stability of the network before and after “denoising”, as well as the efficient frontier of the portfolio under the mean-variance model.
In summary, this paper constructs a mean-realized variance-CVaR portfolio model and discusses the influence of denoising and of the realized covariance on the optimal multi-objective strategy. The paper is organized as follows. Section 2 describes the related methods. Section 3 presents the dataset, the empirical procedure and the out-of-sample performance of the different portfolio strategies. Finally, Section 4 concludes the paper.
2. Methods
For convenience, we first introduce some notation.
$RCOV$: Realized covariance matrix
$COV$: Covariance matrix
$\widetilde{RCOV}$: Realized covariance matrix after denoising of $RCOV$
$\widetilde{COV}$: Covariance matrix after denoising of $COV$
$S^{R}$: Diagonal matrix of realized standard deviations
$S^{P}$: Diagonal matrix of standard deviations
$C^{R}$: Asset correlation matrix based on $RCOV$
$C^{P}$: Pearson correlation coefficient matrix
$\widetilde{C}^{R}$: Asset correlation matrix after denoising of $C^{R}$
$\widetilde{C}^{P}$: Asset correlation matrix after denoising of $C^{P}$
2.1. Realized Covariance Matrix
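We consider the price process of $N$-dimensional financial assets $P_t = (P_{1,t}, \dots, P_{N,t})'$, where $P_{j,t}$ represents the price of the $j$-th asset at time $t$. The logarithmic price vector is
$p_t = \ln P_t = (\ln P_{1,t}, \dots, \ln P_{N,t})'$. (1)
The return vector is as follows:
$r_{t+1} = p_{t+1} - p_t$. (2)
Assume that the time period from $t$ to $t+1$ is divided into $m$ segments, and the asset return on each segment is $r_{t+i/m} = p_{t+i/m} - p_{t+(i-1)/m}$, $i = 1, \dots, m$. So the return matrix from time $t$ to $t+1$ is described as:
$R_{t+1} = (r_{t+1/m}, r_{t+2/m}, \dots, r_{t+1})'$. (3)
Therefore, the realized covariance matrix ($RCOV$) can be defined as:
$RCOV_{t+1} = R_{t+1}' R_{t+1} = \sum_{i=1}^{m} r_{t+i/m} \, r_{t+i/m}'$, (4)
where each main-diagonal element of the realized covariance matrix is the realized variance of the corresponding asset.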
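Formulas (1) to (4) can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function names and the toy prices are hypothetical, and the return matrix is stored with one row per intraday segment and one column per asset.

```python
import math

def log_returns(prices):
    """Intraday log returns for one asset: ln(P_i) - ln(P_{i-1})."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]

def realized_covariance(returns):
    """Realized covariance from an m x N list of intraday return vectors.

    returns[i][j] is the return of asset j on the i-th intraday segment.
    Entry (j, k) is the sum over segments of returns[i][j] * returns[i][k].
    """
    n_assets = len(returns[0])
    rcov = [[0.0] * n_assets for _ in range(n_assets)]
    for row in returns:
        for j in range(n_assets):
            for k in range(n_assets):
                rcov[j][k] += row[j] * row[k]
    return rcov

# Hypothetical intraday prices for two assets (three observations, m = 2 segments).
prices_a = [100.0, 101.0, 100.5]
prices_b = [50.0, 50.0, 50.5]
intraday = list(zip(log_returns(prices_a), log_returns(prices_b)))
rcov = realized_covariance(intraday)
realized_variances = [rcov[j][j] for j in range(2)]  # main diagonal
```

The diagonal entries are the sums of squared intraday returns, i.e., the realized variances of the individual assets.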
2.2. Random Matrix Theory Denoising
2.2.1. Noise Detection
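A random matrix is expressed as:
$Z = \frac{1}{L} A A'$, (5)
where $A$ is an $N \times L$ matrix composed of $N$ uncorrelated random variables with sequence length $L$, and each sequence obeys the $N(0, \sigma^2)$ distribution. Based on Wigner (1951), for the window width $Q = L/N \ge 1$, the predicted maximum and minimum eigenvalues of the random matrix can be expressed as:
$\lambda_{\max/\min} = \sigma^2 \left(1 + \frac{1}{Q} \pm 2\sqrt{\frac{1}{Q}}\right)$, (6)
where $\sigma^2$ is the variance of the elements of $A$, and $\sigma^2 = 1$ for a standardized matrix.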
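As a quick numerical check of formula (6), the bounds can be evaluated directly. This is a minimal sketch; the function name is hypothetical and the value Q = 9.2 is an assumed ratio chosen for illustration, not a figure stated in the text.

```python
import math

def rmt_eigenvalue_bounds(q, sigma2=1.0):
    """Predicted (min, max) eigenvalues of a random matrix with Q = L/N >= 1."""
    if q < 1:
        raise ValueError("Q = L/N must be at least 1")
    spread = 2.0 * math.sqrt(1.0 / q)
    base = 1.0 + 1.0 / q
    return sigma2 * (base - spread), sigma2 * (base + spread)

# Q = 9.2 is an assumed value, used only to illustrate the formula.
lam_min, lam_max = rmt_eigenvalue_bounds(q=9.2)
```

With Q = 9.2 the bounds round to 0.45 and 1.77, and with Q = 3.2 they round to 0.19 and 2.43, which agree with the random-matrix eigenvalues reported in Section 3.3.1. The Q values themselves are inferred here by inverting formula (6).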
Based on Kenett et al. (2009), the eigenvalue entropy (SE) of a random matrix is an effective tool for evaluating the information contained in its eigenvalues, as follows:
, (7)
where $\lambda_i$ ($i = 1, \dots, N$) represents the eigenvalues of the matrix. A smaller SE means less noise, i.e., more economic information is contained in the eigenvalues, and vice versa.
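To make the role of SE concrete, the sketch below computes a normalized Shannon entropy of an eigenvalue spectrum. The exact normalization is an assumption: each eigenvalue is converted to its share of the total, and the entropy is divided by ln N so that it lies in [0, 1]. The function name and the two toy spectra are hypothetical.

```python
import math

def eigenvalue_entropy(eigenvalues):
    """Normalized Shannon entropy of an eigenvalue spectrum (assumed form).

    Requires at least two eigenvalues; non-positive ones contribute nothing.
    """
    positive = [lam for lam in eigenvalues if lam > 0]
    total = sum(positive)
    shares = [lam / total for lam in positive]
    entropy = -sum(p * math.log(p) for p in shares)
    return entropy / math.log(len(eigenvalues))

flat_spectrum = [1.0, 1.0, 1.0, 1.0]        # no dominant direction
dominated_spectrum = [3.7, 0.1, 0.1, 0.1]   # one dominant eigenvalue
se_flat = eigenvalue_entropy(flat_spectrum)
se_dominated = eigenvalue_entropy(dominated_spectrum)
```

Under this assumed form, a flat spectrum attains the maximum entropy, while a spectrum dominated by one large eigenvalue has a much smaller SE, in line with the interpretation that a smaller SE indicates more structure and less noise.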
2.2.2. Denoising Method
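For an $N \times N$ asset correlation matrix $C$ ($C^{R}$, based on the realized covariance matrix, or the Pearson matrix $C^{P}$),
$C = E D E'$, (8)
where $D$ is the diagonal matrix formed by the eigenvalues $\lambda_i$, and $E$ is the eigenvector matrix of $C$. Let $A = \{\lambda_i : \lambda_i \le \lambda_{\max}\}$; then $A$ is the noise set. Here the PG+ denoising method will be employed. All the elements of the set $A$ are replaced by 0 to construct the new diagonal matrix $\widetilde{D}$. Then the denoised asset correlation matrix $\widetilde{C}$ ($\widetilde{C}^{R}$ or $\widetilde{C}^{P}$) can be expressed as:
$\widetilde{C} = E \widetilde{D} E'$. (9)
We set the diagonal elements of $\widetilde{C}$ to 1 to ensure that $\mathrm{tr}(\widetilde{C}) = N$.
The covariance matrix $\Sigma$ ($RCOV$ or $COV$) and the asset correlation matrix $C$ ($C^{R}$ or $C^{P}$) satisfy the following relationship:
$\Sigma = S C S$, (10)
where $S$ ($S^{R}$ or $S^{P}$) represents the diagonal matrix formed by the standard deviation of each asset. So the covariance matrix after denoising, $\widetilde{\Sigma}$, can be obtained from $\widetilde{C}$ as $\widetilde{\Sigma} = S \widetilde{C} S$.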
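The denoising steps in formulas (8) to (10) can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, eigenvalues at or below a given threshold are treated as noise and set to zero, and the toy correlation matrix and standard deviations are made up.

```python
import numpy as np

def denoise_correlation(corr, lambda_max):
    """Zero out eigenvalues <= lambda_max, rebuild, and reset the diagonal to 1."""
    eigvals, eigvecs = np.linalg.eigh(corr)            # corr = E D E'
    kept = np.where(eigvals > lambda_max, eigvals, 0.0)
    denoised = eigvecs @ np.diag(kept) @ eigvecs.T     # E D~ E'
    np.fill_diagonal(denoised, 1.0)                    # restores trace = N
    return denoised

def covariance_from_correlation(corr, std):
    """Rescale a correlation matrix by a diagonal matrix of standard deviations."""
    s = np.diag(std)
    return s @ corr @ s

# Hypothetical 4-asset correlation matrix with one strong common factor.
corr = np.full((4, 4), 0.8)
np.fill_diagonal(corr, 1.0)
denoised = denoise_correlation(corr, lambda_max=1.0)
cov = covariance_from_correlation(denoised, std=np.array([0.1, 0.2, 0.1, 0.2]))
```

In this toy case only the largest eigenvalue (3.4) survives the threshold, so every off-diagonal entry of the denoised matrix becomes 0.85 before rescaling.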
2.3. Mean-Realized Variance-CVaR Optimization Model
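Suppose that $r = (r_1, \dots, r_N)'$ is the return vector of $N$ assets, and $x = (x_1, \dots, x_N)'$ is the weight vector. The variance of the cumulative return of the portfolio is $x' M x$, where $M$ may be the realized covariance matrix ($RCOV$), the realized covariance matrix after noise reduction ($\widetilde{RCOV}$) or the covariance matrix after noise reduction ($\widetilde{COV}$).
Based on Roman et al. (2007), we will study the following problem:
$\min_{x} \; x' M x \quad \text{s.t.} \quad E(x' r) \ge d, \;\; CVaR_{\alpha}(x' r) \le z, \;\; \sum_{j=1}^{N} x_j = 1, \;\; x_j \ge 0, \; j = 1, \dots, N,$ (11)
where $d$ represents the investor’s target return rate, $z$ represents the bound that controls CVaR, $\alpha$ is the level at which CVaR is computed, $\sum_{j=1}^{N} x_j = 1$ is the weight constraint for full investment, and $x_j \ge 0$ means that no short-selling is permitted. The specific determination of the parameters $d$ and $z$ is shown in Appendix A.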
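Before handing problem (11) to a solver, it can help to evaluate the objective and the constraints for a candidate weight vector. The sketch below does only that; it does not solve the optimization. All names are hypothetical, scenarios are assumed equally likely, and CVaR is computed as the minimum over v of -v + (1/alpha) E[(v - R)^+], with alpha interpreted as the tail probability (an assumption about the convention).

```python
def empirical_cvar(portfolio_returns, alpha):
    """CVaR of a return sample: min over v of -v + (1/alpha) * mean((v - r)^+).

    The objective is piecewise linear and convex in v, so the minimum is
    attained at one of the observed returns; scanning them is enough.
    """
    n = len(portfolio_returns)

    def objective(v):
        shortfall = sum(max(v - r, 0.0) for r in portfolio_returns) / n
        return -v + shortfall / alpha

    return min(objective(v) for v in portfolio_returns)

def check_candidate(weights, scenarios, cov, d, z, alpha, tol=1e-9):
    """Evaluate the variance objective and the constraints for given weights."""
    n_assets = len(weights)
    port = [sum(w * r for w, r in zip(weights, row)) for row in scenarios]
    mean_return = sum(port) / len(port)
    variance = sum(weights[j] * cov[j][k] * weights[k]
                   for j in range(n_assets) for k in range(n_assets))
    cvar = empirical_cvar(port, alpha)
    feasible = (mean_return >= d - tol and cvar <= z + tol
                and abs(sum(weights) - 1.0) <= tol
                and all(w >= -tol for w in weights))
    return {"variance": variance, "mean": mean_return,
            "cvar": cvar, "feasible": feasible}

# Hypothetical two-asset example with five equally likely scenarios.
scenarios = [(-0.06, -0.02), (-0.03, -0.01), (0.02, 0.00), (0.02, 0.04), (0.06, 0.04)]
cov = [[0.0016, 0.0006], [0.0006, 0.0009]]
result = check_candidate([0.5, 0.5], scenarios, cov, d=0.005, z=0.04, alpha=0.4)
```

With alpha = 0.4 and five scenarios, the CVaR here equals the negative of the average of the two worst portfolio returns (0.03), so the candidate satisfies a bound of z = 0.04 but would violate z = 0.02.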
To show the impact of random matrix theory and of the realized variance on investment strategies, the following three optimization models are compared in this paper; see Table 1.
2.4. Model Evaluation
The average return:
$\bar{r} = \frac{1}{T} \sum_{t=1}^{T} r_t' x^{*}$, (12)
where $r_t$ ($t = 1, \dots, T$) represents the out-of-sample data, and $x^{*}$ represents the optimal investment weight.
The Omega Ratio (OR), proposed by Keating & Shadwick (2002), is defined as:
$OR(\tau) = \frac{\int_{\tau}^{+\infty} [1 - F(y)] \, dy}{\int_{-\infty}^{\tau} F(y) \, dy}$, (13)
where $F$ represents the cumulative distribution function of portfolio returns and $\tau$ is a specified threshold. Returns below the threshold are considered as losses and returns above it as gains. For convenience of calculation, $\tau = 0$ is assumed (Clemente et al., 2019). An investor will prefer the portfolio with the highest ratio.
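Both evaluation measures are straightforward to compute from a series of out-of-sample portfolio returns. The sketch below uses the standard identity that the numerator of the Omega ratio equals the expected gain above the threshold and the denominator equals the expected loss below it, applied to the empirical distribution. The function names and the sample returns are hypothetical.

```python
def average_return(portfolio_returns):
    """Arithmetic mean of the per-period portfolio returns."""
    return sum(portfolio_returns) / len(portfolio_returns)

def omega_ratio(portfolio_returns, threshold=0.0):
    """Empirical Omega ratio: total gains above the threshold over total losses below it."""
    gains = sum(max(r - threshold, 0.0) for r in portfolio_returns)
    losses = sum(max(threshold - r, 0.0) for r in portfolio_returns)
    if losses == 0.0:
        return float("inf")  # no return falls below the threshold
    return gains / losses

# Hypothetical out-of-sample portfolio returns.
sample = [0.02, -0.01, 0.03, -0.02]
mean_ret = average_return(sample)
omega = omega_ratio(sample, threshold=0.0)
```

For this sample the gains sum to 0.05 and the losses to 0.03, so the ratio is about 1.67; a value above 1 means gains outweigh losses relative to the threshold.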
3. Empirical Study
3.1. Dataset Description
The data come from the Shanghai Stock Exchange 180 (SSE 180) Index, which consists of the 180 stocks that best represent China’s A-share market. Five-minute return data of 120 stocks are collected from July 1, 2019 to August 10, 2019, and their five-minute logarithmic returns are calculated. The data from July 1, 2019 to July 31, 2019 are used as in-sample data and the rest as out-of-sample data.
3.2. Empirical Procedure
The empirical study proceeds as follows.
Step 1: calculating the realized covariance matrix ($RCOV$) based on formulas (1) to (4).
Step 2: “noise” detection. The noise information in the asset correlation matrix $C$ ($C^{R}$ or $C^{P}$) and the random matrix $Z$ will be analyzed by the eigenvalue entropy (SE) based on formulas (5), (6) and (7).
Step 3: constructing the denoised covariance matrix $\widetilde{\Sigma}$ ($\widetilde{RCOV}$ or $\widetilde{COV}$) according to formulas (8), (9) and (10).
Step 4: calculating the optimal asset weights under MRVC (denoise), MRVC and MVC (denoise) based on model (11) respectively.
(1) Assume that
is the median of the average return of all assets, and the interval
is determined by formulas from (14) to (18).
takes the
,
,
and
quantile value of this interval respectively, denoted as
,
,
and
.
(2) Find the optimal weight
of assets under mean-variance model based on
in last step.
(3) Based on the given
and
, the interval
of z is determined by formulas (19) and (20) for
. The values of z are taken to be the
,
,
of quantile values of interval
and
respectively, denoted as
,
,
and
.
(4) Problems of
,
and
with the given
and z are solved with the CVX toolbox in MATLAB, which yields the optimal solution
.
Step 5: the in-sample optimal asset weights under the three models are obtained from Step 1 to Step 4. The average returns and OR values for the out-of-sample dataset are then calculated by formulas (12) and (13).
3.3. Empirical Results
3.3.1. Characteristic Analysis of Asset Correlation Matrix
We calculate the asset correlation matrix $C$ ($C^{R}$, based on the realized covariance matrix, or the Pearson matrix $C^{P}$) and detect its noise following Step 1 and Step 2; the results are shown in Table 2 and Table 3.
From Table 2, we find that the maximum (minimum) eigenvalue 44.32 (0.01) of matrix $C^{P}$ and the maximum (minimum) eigenvalue 2.43 (0.19) of its corresponding random matrix are greater (smaller) than the maximum (minimum) eigenvalue 20.84 (0.25) of $C^{R}$ and the maximum (minimum) eigenvalue 1.77 (0.45) of its corresponding random matrix. This tells us that matrix $C^{R}$ has a smaller noise interval. Meanwhile, compared with $C^{P}$, the percentage of noise in $C^{R}$ is smaller, which means that matrix $C^{R}$ contains more useful economic information.
Table 2. Characteristic analysis of asset correlation matrix and random matrix.
Table 3. SE of asset correlation matrix and random matrix.
Notes: the symbol “A” means that all eigenvalues are considered, while “B” means that the 7 largest eigenvalues are removed.
It can be seen from Table 3 that the SE of the asset correlation matrix $C$ ($C^{R}$ or $C^{P}$) is much smaller than that of its corresponding random matrix, which means that $C$ ($C^{R}$ or $C^{P}$) contains more economic information than its random matrix. After removing the eigenvalues greater than $\lambda_{\max}$ in $C^{R}$ and $C^{P}$ respectively, SE rises sharply, which indicates that removing the larger eigenvalues might reduce the information in the asset correlation matrix. Therefore, we only replace eigenvalues less than 5 with 0 when the PG+ method is used.
3.3.2. Out-of-Sample Performance of Optimal Asset Allocation
Based on Step 3 to Step 4 in Section 3.2, we can obtain the optimal investment strategy under each model with different parameters, and the average return and OR values are shown in Table 4.
The following results can be seen from Table 4.
1) Under every constraint pair $(d, z)$ on the mean and CVaR, both the average return and the OR of MRVC (denoise) are higher than those of MVC (denoise), which suggests that introducing the realized covariance matrix for high-frequency data captures market information more effectively and supports more appropriate investment decisions.
2) Compared with the MRVC model, the average return and OR of MVC (denoise) are improved in most cases, which tells us that the use of random matrix theory can indeed improve portfolio performance to some extent. The performance of the MRVC (denoise) model is sensitive to the choice of the parameters $d$ and $z$.
To further understand the out-of-sample performance of each model under different parameters, we plot the cumulative return obtained with the optimal portfolio weights; see Figure 1.
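A cumulative return path of this kind can be sketched as a running sum of per-period portfolio returns. The sketch assumes fixed weights over the out-of-sample period and treats the weighted sum of asset log returns as the portfolio return, which is a simplifying approximation. The function name and the data are hypothetical.

```python
from itertools import accumulate

def cumulative_returns(weights, out_of_sample):
    """Running sum of per-period portfolio returns for fixed weights."""
    per_period = [sum(w * r for w, r in zip(weights, row)) for row in out_of_sample]
    return list(accumulate(per_period))

# Hypothetical weights and three out-of-sample return vectors for two assets.
weights = [0.5, 0.5]
out_of_sample = [(0.01, 0.03), (-0.02, 0.00), (0.04, 0.02)]
path = cumulative_returns(weights, out_of_sample)
```

Plotting one such path per model and parameter setting gives a comparison in the spirit of Figure 1.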
From Figure 1 we can draw the following conclusions.
1) For every choice of $(d, z)$, MRVC (denoise) performs best and MRVC worst. This shows that combining the realized covariance matrix with random matrix theory in the optimization model improves portfolio performance more than either alone.
2) There is little difference among the three models in the early stage, when the market fluctuates only slightly. However, the advantage of MRVC (denoise) becomes apparent once the market fluctuates sharply.
3) At a fixed return target, the advantage of MRVC (denoise) gradually increases as the constraint on CVaR is relaxed.
Table 4. Out-of-sample performance of optimal investment strategy.
Figure 1. Cumulative return graph of investment period after (before) denoising.
4. Conclusion
This paper studies a multi-objective investment strategy based on mean-realized variance-CVaR and random matrix theory for high-frequency data. Compared with Roman et al. (2007), the innovation of this paper is the introduction of the realized covariance matrix and random matrix theory into the optimization problem (Clemente et al., 2019). Compared with Li & Hong (2019), this paper considers CVaR and variance simultaneously as risk-control factors. To a certain extent, the new model can better deal with the high-frequency, noisy and heavy-tailed characteristics of financial market data. The empirical study finds that the noise percentage in the asset correlation matrix based on the realized covariance matrix is significantly reduced, so this matrix carries more effective information. The out-of-sample performance of MRVC (denoise) is significantly better than that of the other two models, which suggests that the use of the realized covariance matrix and random matrix theory may help to improve the quality and effectiveness of the information extracted from high-frequency data in investment problems. Owing to space limitations, this paper only considers five-minute return data of 120 stocks; the relationship among different high-frequency data, the denoising effect and the covariance matrix estimator is a direction for future research.
Acknowledgments
This work was partially supported by the National Natural Science Foundation of China under Grant Nos. 71671104 and 11971301.
Appendix A
Based on the mean-variance-CVaR model in Roman et al. (2007), $CVaR_{\alpha}$ can be written as follows:
, (14)
Thus formula (11) can be rewritten as
:
(15)
Here v is the value of
,
represents the probability of return rate
at time i,
,
represents the return rate of asset j at time i, and
is the expected return rate of asset j.
In order to ensure
has a feasible solution, d and z need to be within a certain range, that is,
,
, where
.
is determined by
:
(16)
Solving the
to get
, thus
.
is determined by:
(17)
Solving the
to get
, thus
.
is determined by:
(18)
Solving
to get
, thus
.
Here
is determined by the model:
(19)
Solving
to get
and
, thus
.
is determined by:
(20)
Solving
to get
, thus
. Here
is the optimal portfolio weight of the solution when the mean constraint is
in mean-variance model.