Genetic algorithms approach to feature discretization in artificial neural networks for the prediction of stock price index

https://doi.org/10.1016/S0957-4174(00)00027-0

Abstract

This paper proposes a genetic algorithms (GAs) approach to feature discretization and the determination of connection weights for artificial neural networks (ANNs) to predict the stock price index. Previous research proposed many hybrid models of ANN and GA for network training, feature subset selection, and topology optimization. In most of these studies, however, GA is used only to improve the learning algorithm itself. In this study, GA is employed not only to improve the learning algorithm, but also to reduce the complexity of the feature space. GA simultaneously optimizes the connection weights between layers and the thresholds for feature discretization. The genetically evolved weights mitigate the well-known limitations of the gradient descent algorithm. In addition, globally searched feature discretization reduces the dimensionality of the feature space and eliminates irrelevant factors. Experimental results show that the GA approach to feature discretization outperforms the other two conventional models.

Introduction

Predicting the stock price index has long attracted research interest. Many of these studies use data mining techniques, including artificial neural networks (ANNs). However, most studies showed that ANN had some limitations in learning the patterns because stock market data has tremendous noise and complex dimensionality. ANN has preeminent learning ability, but its performance on noisy data is often inconsistent and unpredictable. In addition, the amount of data is sometimes so large that the learning of patterns may not work well. In particular, continuous features in a large data set span a huge data space, which makes it challenging to extract explicit concepts from the raw data (Liu & Setiono, 1996). Many researchers in the data mining community are therefore interested in the reduction of dimensionality. Reducing or transforming irrelevant or redundant features may shorten the running time and yield more generalized results (Dash & Liu, 1997).

This paper proposes a new hybrid model of ANN and genetic algorithms (GAs) for feature discretization to mitigate the above limitations. Feature discretization transforms continuous values into discrete ones according to certain thresholds. It is closely related to dimensionality reduction (Liu & Motoda, 1998a). Properly discretized data can simplify the learning process and may improve the generalizability of the learned results. This study uses GA to search for optimal or near-optimal thresholds for feature discretization. In addition, it simultaneously searches for the connection weights between layers in ANN. The genetically evolved connection weights mitigate the well-known limitations of the gradient descent algorithm.
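To make the idea concrete, the following is a minimal sketch of how a single GA chromosome could carry both the discretization thresholds and the connection weights, and how its fitness could be scored as prediction accuracy. It is an illustration under stated assumptions, not the authors' implementation: the function names, the gene layout (thresholds first, then weights), the one-hidden-layer network without bias terms, and the toy values (including the lower threshold of 25) are all hypothetical.

```python
from bisect import bisect_right
from math import exp


def discretize(value, thresholds):
    """Return the category index (0..len(thresholds)) that value falls into."""
    return bisect_right(sorted(thresholds), value)


def decode(chromosome, n_features, n_thresholds, n_hidden):
    """Split one flat gene list into thresholds and two weight layers.

    Assumed layout (hypothetical): all thresholds first, then the
    input-to-hidden weights, then the hidden-to-output weights.
    """
    t_end = n_features * n_thresholds
    w1_end = t_end + n_features * n_hidden
    thresholds = [chromosome[f * n_thresholds:(f + 1) * n_thresholds]
                  for f in range(n_features)]
    w_hidden = [chromosome[t_end + h * n_features:t_end + (h + 1) * n_features]
                for h in range(n_hidden)]
    w_out = chromosome[w1_end:w1_end + n_hidden]
    return thresholds, w_hidden, w_out


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def predict(sample, thresholds, w_hidden, w_out):
    """Discretize each feature, then run a one-hidden-layer forward pass."""
    x = [discretize(v, t) for v, t in zip(sample, thresholds)]
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    out = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return 1 if out >= 0.5 else 0


def fitness(chromosome, samples, labels, n_features, n_thresholds, n_hidden):
    """Fraction of samples whose predicted direction matches the label."""
    parts = decode(chromosome, n_features, n_thresholds, n_hidden)
    hits = sum(predict(s, *parts) == y for s, y in zip(samples, labels))
    return hits / len(samples)


# Toy data (made up): one feature, two thresholds, two hidden units.
chromosome = [25.0, 75.0, 2.0, 0.0, 1.0, -1.5]
samples = [[80.0], [50.0], [10.0], [90.0]]
labels = [1, 1, 0, 1]
score = fitness(chromosome, samples, labels,
                n_features=1, n_thresholds=2, n_hidden=2)
```

A GA would evolve a population of such chromosomes through selection, crossover, and mutation, using `fitness` as the objective, so that thresholds and weights are searched together rather than in separate stages.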

The rest of the paper is organized as follows. Section 2 reviews prior research. Section 3 proposes feature discretization using GA and describes the benefits of the proposed approach. Section 4 describes the research design and experiments. In Section 5, the empirical results are summarized and discussed. In Section 6, conclusions and the limitations of this study are presented.

Section snippets

Prior research on stock market prediction using ANN

Many studies on stock market prediction using artificial intelligence (AI) techniques have been performed during the past decade. These studies used various types of ANN to accurately predict the stock index and the direction of its change.

In one of the earliest studies, Kimoto, Asakawa, Yoda and Takeoka (1990) used several learning algorithms and prediction methods to develop a prediction system for the Tokyo stock exchange prices index (TOPIX). They used the modular neural network to learn the

GA approach to feature discretization for ANN

Many fund managers and investors in the stock market generally accept and use certain criteria for technical indicators as the signal of future market trends. Even if a feature represents a continuous measure, the experts usually interpret the values in qualitative terms such as low, medium, and high (Slowinski & Zopounidis, 1995). For ‘Stochastic %K’, one of the most popular technical indicators, the value of 75 is basically accepted by stock market analysts as a strong signal if the value

Research data and experiments

The research data used in this study consist of technical indicators and the direction of change in the daily Korea stock price index (KOSPI). The sample covers 2928 trading days, from January 1989 to December 1998. Table 2 gives the selected features and their formulas (Achelis, 1995; Chang et al., 1996; Choi, 1995; Edwards and Magee, 1997; Gifford, 1995).

The direction of daily change in the stock price index is categorized as “0” or “1”. “0” means that the next day's index is lower than the

Experimental results

Three models are compared according to the methods of determining the connection weights and the methods of feature transformation. Table 5 describes the average prediction accuracy of each model.

In Table 5, GAFD has higher prediction accuracy than BPLT and GALT by 10–11% for the holdout data. It is worth noting that there is only a slight difference in prediction accuracy between the training data and the holdout data for GAFD. There is, however, a wide difference between

Concluding remarks

As mentioned earlier, previous studies tried to optimize the controlling parameters of ANN using global search algorithms. Some of them focused only on the optimization of the connection weights of ANN. Others had an interest in the optimization of the learning algorithm itself, but most studies had little interest in dimensionality reduction and the elimination of irrelevant patterns. This paper has proposed a new hybrid GA and ANN to mitigate the above limitations. In this paper, GA not

Acknowledgements

The authors would like to thank the Korea Science and Engineering Foundation for supporting this work under Grant No. 98-0102-08-01-3.

References (46)

  • R. Tsaih et al. (1998). Forecasting S&P 500 stock index futures with a hybrid AI system. Decision Support Systems.
  • S.B. Achelis (1995). Technical analysis from A to Z.
  • H. Adeli et al. (1995). Machine learning: neural networks, genetic algorithms, and fuzzy systems.
  • Ahmadi, H. (1990). Testability of the arbitrage pricing theory by neural networks. Proceedings of the International...
  • R.J. Bauer (1994). Genetic algorithms and investment strategies.
  • J.P. Bigus (1996). Data mining with neural networks.
  • P. Buhlmann (1998). Extreme events from the return-volume process: a discretization approach for complexity reduction. Applied Financial Economics.
  • J. Chang et al. (1996). Technical indicators and analysis methods.
  • J. Choi (1995). Technical indicators.
  • Choi, J.H., Lee, M.K., & Rhee, M.W. (1995). Trading S&P 500 stock index futures using a neural network. Proceedings of...
  • D.R. Cooper et al. (1995). Business research methods.
  • L. Davis. Genetic algorithms and financial applications.
  • R. Dorsey et al. (1998). The use of parsimonious neural networks for forecasting financial time series. Journal of Computational Intelligence in Finance.

1 Tel.: +82-2-958-3613; fax: +82-2-958-3604.
