Abstract
According to statistical learning theory, the support vectors represent the most informative data points and compress the information contained in the training set. However, a basic problem with the standard support vector machine is that, when the data are noisy, its formulation offers no guaranteed mechanism to dissuade the machine from learning the noise. Consequently, the noise that is typically present in financial time series data may be selected as support vectors, and these noisy support vectors are then modeled into the estimated function. The inclusion of noise in the support vectors may thus lead to over-fitting and, in turn, to poor generalization. In this article, the standard support vector regression (SVR) is reformulated so that the large errors corresponding to noise are restricted by a new parameter \(E\). Simulation and real-world experiments indicate that the new SVR machine performs meaningfully better than the standard SVR in terms of accuracy and precision, especially when the data are noisy, but at the expense of a longer computation time.
Notes
Indeed, the \(\varepsilon \)-insensitive loss, being a linear loss function, is somewhat immune to noise relative to the squared loss function. However, noise still has a great impact on accuracy and precision.
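As a minimal illustration (not the article's implementation), the sketch below contrasts the two losses: the \(\varepsilon \)-insensitive loss is zero inside the tube and grows only linearly outside it, whereas the squared loss amplifies large, possibly noisy, residuals. The residual values and the tube width are arbitrary assumptions.

```python
import numpy as np

def eps_insensitive_loss(residuals, eps=0.1):
    """epsilon-insensitive loss: zero inside the eps-tube, linear outside."""
    return np.maximum(np.abs(residuals) - eps, 0.0)

def squared_loss(residuals):
    """Squared loss: large (possibly noisy) residuals dominate the fit."""
    return np.asarray(residuals) ** 2

r = np.array([0.05, 0.2, 3.0])   # hypothetical residuals
print(eps_insensitive_loss(r))   # small residual ignored, large one linear
print(squared_loss(r))           # large residual is amplified
```

The comparison makes the note's point concrete: under the squared loss the residual of 3.0 contributes 9.0 to the objective, while under the linear \(\varepsilon \)-insensitive loss it contributes only 2.9.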
Note that this approach is not regarded as tuning the free parameters.
Of course, it is possible to set the value of \(\varepsilon \) to 0. In that case, small errors contribute to the fitting of the curve, and sparsity is obtained only by ignoring the large errors.
In this article, the problem of finding an optimal value for \(E\) is not considered in depth. However, choosing the value of \(E\) resembles choosing \(\varepsilon \). The optimal choice of \(\varepsilon \) has been addressed by several authors, whose methods may similarly be used to choose an optimal value for \(E\); see, for example, Smola et al. (1998).
Based on the AR(2) model, the current observation is assumed to be a function of the two most recent past observations.
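A minimal sketch of simulating such a process is shown below; the coefficients, sample size, and noise level are illustrative assumptions, not the values used in the article's experiments.

```python
import numpy as np

def simulate_ar2(phi1, phi2, n, sigma=1.0, seed=0):
    """Simulate x_t = phi1 * x_{t-1} + phi2 * x_{t-2} + e_t, e_t ~ N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    for t in range(2, n):
        x[t] = phi1 * x[t - 1] + phi2 * x[t - 2] + rng.normal(0.0, sigma)
    return x

# Hypothetical coefficients chosen inside the stationarity region.
path = simulate_ar2(0.5, -0.3, 200)
```

Each simulated path of this kind could then serve as one training series for the regression machine.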
The e1071 package for the R software provides implementations of latent class analysis, the short-time Fourier transform, fuzzy clustering, support vector machines, etc.
Determining such a point before the approximation and before running the machine seems impossible; therefore, this point is unknown.
With an RMSE value calculated for each sample path, there is a set of 7,000 RMSEs in total (one per sample path) for each machine. The t-test examines the difference between the mean RMSEs of the two machines. The test is also applied to the Bias. Differences between the noise scenarios are tested as well (Table 3).
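The comparison described above can be sketched as follows. The RMSE values here are synthetic placeholders (the article's actual RMSE sets are not reproduced), and a paired t-test is used on the assumption that both machines are evaluated on the same sample paths.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical RMSE sets: one value per sample path, for each machine.
rmse_standard = rng.normal(1.00, 0.10, 7000)
rmse_novel = rng.normal(0.95, 0.10, 7000)

# Paired t-test on the per-path RMSEs of the two machines.
t_stat, p_value = stats.ttest_rel(rmse_standard, rmse_novel)
print(t_stat, p_value)
```

A positive `t_stat` with a small `p_value` would indicate that the second machine's mean RMSE is significantly lower; the same procedure applies to the Bias sets.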
For brevity, details of the results for the parameter \(\varepsilon \) are not included in the table.
References
Abu-Mostafa YS, Atiya AF (1996) Introduction to financial forecasting. Appl Intell 6:205–213
Campbell J, Lo A, MacKinlay A (1997) The econometrics of financial markets. Princeton University Press, Princeton
Cherkassky V, Ma Y (2002) Selection of meta-parameters for support vector regression. In: Dorronsoro JR (ed) Proceedings of the international conference artificial neural networks: ICANN 2002, Madrid, Spain, pp 687–693
Cherkassky V, Ma Y (2004) Comparison of loss functions for linear regression. IEEE, New York
Cherkassky V, Ma Y (2004) Practical selection of SVM parameters and noise estimation for SVM regression. Neural Netw 17:113–126
Collobert R, Sinz F, Weston J, Bottou L (2006) Trading convexity for scalability. In: Proceedings of the 23rd international conference on machine learning, Pittsburgh, PA
Enders W (2009) Applied econometric time series, 3rd edn. Wiley series in probability and statistics. Wiley, Hoboken
Evgeniou T, Pontil M, Poggio T (2000) Statistical learning theory: a primer. Int J Comput Vis 38(1):9–13
Giles CL, Lawrence S, Tsoi AC (2001) Noisy time series prediction using a recurrent neural network and grammatical inference. Mach Learn 44(1/2):161–183
Hastie T, Tibshirani R, Friedman J (2008) The elements of statistical learning: data mining, inference, and prediction, 2nd edn. Springer series in statistics. Springer, New York
Kim KJ (2003) Financial time series forecasting using support vector machines. Neurocomputing 55:307–319
Lütkepohl H, Kraetzig M, Phillips PCB (2004) Applied time series econometrics. Cambridge University Press, Cambridge
Magdon-Ismail M, Nicholson A, Abu-Mostafa YS (1998) Financial markets: very noisy information processing. Proc IEEE 86(11):2184–2195
Malkiel B (1985) A random walk down Wall Street. Norton, New York
Mandelbrot B (1983) The fractal geometry of nature. W.H. Freeman, San Francisco
Rachev ST, Menn C, Fabozzi FJ (2005) Fat-tailed and skewed asset return distributions: implications for risk management, portfolio selection, and option pricing. Wiley Finance
Samorodnitsky G, Taqqu MS (1994) Stable non-Gaussian random processes: stochastic models with infinite variance. Chapman and Hall, London
Smola AJ, Murata N, Schölkopf B, Müller K-R (1998) Asymptotically optimal choice of \(\varepsilon \)-loss for support vector machines. In: Proceedings of the international conference on artificial neural networks
Sun W, Rachev S, Fabozzi FJ (2007) Fractals or I.I.D.: evidence of long-range dependence and heavy tailedness from modeling German market returns. J Econ Bus 59:575–595
Tay FEH, Cao L (2001) Application of support vector machines in financial time series forecasting. Omega 29:309–317
Wahba G (1990) Spline models for observational data. CBMS-NSF regional conference series in applied mathematics, vol 59. SIAM, Philadelphia
Vapnik VN (1998) Statistical learning theory. Wiley, New York
Vapnik VN, Chervonenkis AY (1971) On the uniform convergence of relative frequencies of events to their probabilities. Theory Probab Its Appl 16(2):264–280
Yang H, Chan L, King I (2002) Support vector machine regression for volatile stock market prediction. In: Proceedings of IDEAL 2002. Springer, Berlin, pp 391–396
Yang H, Huang K, Chan L, King I, Lyu MR (2004) Outliers treatment in support vector regression for financial time series prediction. In: Proceedings of ICONIP 2004. Springer, Berlin, pp 1260–1265
Yuille AL, Rangarajan A (2004) The concave–convex procedure (CCCP). In: Advances in neural information processing systems, vol 14. MIT Press, Cambridge, MA
Acknowledgments
I would like to thank the Editors and Referees of the journal for their instructive review of the paper. Their suggestions led to further research findings. I also thank Prof. Dr. Detlef Seese, Institute AIFB, Karlsruhe Institute of Technology (KIT) and Dr. Rahim Mahmudvand, Department of Statistics, Faculty of Mathematical Sciences, Shahid Beheshti University for their useful comments.
Safari, A. An e–E-insensitive support vector regression machine. Comput Stat 29, 1447–1468 (2014). https://doi.org/10.1007/s00180-014-0500-7