Elsevier

Journal of Computational Science

Volume 28, September 2018, Pages 294-303

Exploiting investors social network for stock prediction in China's market

https://doi.org/10.1016/j.jocs.2017.10.013

Highlights

  • We conduct a thorough data analysis on a social network for retail investors.

  • Collective sentiments are correlated with the stock movements.

  • User perceived stock relatedness can capture the implicit stock correlations.

  • The social network features are fused to predict stock movements.

Abstract

Recent works have shown that social media platforms are able to influence the trends of stock price movements. However, existing works have mainly focused on the U.S. stock market and paid little attention to emerging markets such as China, where retail investors dominate the market. Because retail investors are prone to be influenced by news and social media, psychological and behavioral features extracted from social media platforms are expected to predict stock price movements in China's market well. Recent advances in investor social networks in China enable the extraction of such features from web-scale data. In this paper, on the basis of tweets from Xueqiu, a popular Chinese Twitter-like social platform specialized for investors, we analyze features with regard to collective sentiment and perception of stock relatedness, and predict stock price movements by employing nonlinear models. The features of interest prove to be effective in our experiments.

Introduction

Social networks such as Twitter, Weibo, Facebook, and LinkedIn have attracted millions of users to post and acquire information, and they have been well studied by various works [1], [2], [3], [4]. In addition to these general social networks, there is another breed of smaller, more focused sites that cater to niche audiences. Here we look at Xueqiu, a social site designed for traders and investors. Xueqiu is a specialized social network for Chinese stock market investors, and with the increasing number of retail investors it has attracted millions of users. Xueqiu enables investors to share their opinions on a Twitter-like platform, or to post their portfolios, demonstrating their trading operations and returns. Unlike general social networks, almost all the information on Xueqiu is related to stocks, making it a natural data source for collecting investors’ perceptions, which may be useful for stock market prediction in China.

Early literature on stock market prediction was based on the Efficient Market Hypothesis (EMH) and random walk theory [5]. However, investors’ reactions may not follow a random walk model in reality. Behavioral economics has provided ample evidence that financial decisions are significantly driven by sentiment. The collective level of optimism or pessimism in society can affect investor decisions [6], [7]. In addition, investor perceptions of the relatedness of stocks can be a potential predictor. Firms may be economically related to one another [8], [9]. It is therefore possible that one stock's price movement influences that of its peers, because investors act on their perceptions of such relatedness.

Sentiment and perception are psychological constructs and thus difficult to measure in archival analyses. News articles have been used as a major source for textual content analysis. For example, news articles are employed to analyze public mood [10], by which stock price movements can be predicted. However, this type of content has an obvious drawback: news articles directly reflect their authors’ sentiment rather than the investors’. Online social platforms provide more direct data and create opportunities for exploring users’ sentiment and perception. Among recent studies, [11] finds that collective mood derived from Twitter feeds improved the prediction accuracy of the Dow Jones Industrial Average (DJIA). Facebook's Gross National Happiness (GNH) index is shown to have the ability to predict changes in both daily returns and trading volume in the U.S. stock market [12]. The predictability of stock price behavior from StockTwits data (StockTwits is a Twitter-like platform specialized in exchanging trading-related opinions) is reported in [13].

Most of the existing studies have focused on the U.S. stock market and paid little attention to emerging markets such as China, where the stock market is inefficient and exhibits a considerable non-random walk pattern [14]. China's stock market (also denoted as the A-share market) differs remarkably from other major markets in the structure of its investors. Specifically, unlike other major stock markets, which are dominated by institutional investors, China's market has a greater percentage of retail investors. Importantly, retail investors are more likely to buy rather than sell stocks that catch their attention, and thus tend to be influenced by news or other social media [15]. Therefore, in this paper, we study China's stock market based on a unique dataset from a popular Chinese Twitter-like social platform specialized for investors, namely Xueqiu (which means 'snowball' in Chinese), aiming to fill this gap in the literature.

To demonstrate how closely Xueqiu is related to China's stock market, Fig. 1(a) shows the daily volume of tweets published about all stocks on Xueqiu and the daily trading volume of the A-share market from November 2014 to May 2015. The two curves fluctuate in close synchrony, especially when trading volume is highly volatile. For individual stocks, the synchronicity between daily tweet volume and daily turnover rate still holds, as displayed in Fig. 1(b), which takes CITIC Securities, one of the most popular stocks on Xueqiu, as an example. On the basis of the tweets from Xueqiu, we analyze features with regard to collective sentiment and perception. The sentiment and perceived stock relatedness are derived from two types of networks extracted from Xueqiu: the user network, and the stock network perceived by users. Combined with the network characteristics, the features can exhibit better predictive performance. In contrast to previous works that study only a small subset of stocks, we evaluate our proposal on all the active stocks (more than 2000) in the A-share market, which indicates that the approach is feasible.
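One simple way to quantify the kind of synchronicity described for Fig. 1 is a Pearson correlation between the two daily series. The sketch below is an illustration only and is not necessarily the analysis used in the paper; the function name `pearson` and the sample numbers are hypothetical placeholders, not Xueqiu or A-share data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-deviation of the two series around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Made-up daily values for illustration only (not real market data).
tweet_volume = [120, 150, 90, 300, 280]
trading_volume = [1.1, 1.4, 0.9, 2.8, 2.5]
r = pearson(tweet_volume, trading_volume)
```

A value of `r` close to 1 would indicate that the two series rise and fall together; a correlation of this kind says nothing about which series leads the other.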

In the remainder of the article, we first briefly introduce related research in Section 2. The online social platform Xueqiu and the crawled dataset are described in Section 3. Then, we describe the methodology in Section 4 and present the experiment of predicting stock price movements in Section 5. Finally, the article is concluded in Section 6.

Stock prediction with historical price data

Most previous studies utilize historical stock prices to make predictions with various models [16], [17], [18], [19]. A Support Vector Machine-based model that uses a selected subset of financial indexes as the weighted inputs is proposed in [20]. A multi-layer perceptron method is proposed for short-term stock prediction in [21]. Multiple techniques of Artificial Neural Network (ANN) in stock market prediction are evaluated in [22]. However, these works only use the historical price data and

Data description

This section gives details on the mechanism of Xueqiu and the dataset used in this paper. We also conduct data analysis to show the characteristics of Xueqiu.

Prediction of stock price movement

In this section, we model the prediction of stock price movement as a binary classification problem. Then we discuss how to extract features from three different types of information sources. After that, we evaluate the classification model to verify the effectiveness of the information from Xueqiu.
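To make the binary classification framing concrete, the sketch below labels each trading day by the direction of the next day's close and scores a set of predictions against those labels. It is a minimal illustration under stated assumptions: the exact labeling rule is not given in this excerpt, so treating a flat day as "not up" (label 0) is an assumption, and the names `label_movements` and `accuracy` and the sample prices are hypothetical.

```python
def label_movements(closes):
    """Label day t with 1 if the close on day t+1 is strictly higher, else 0.

    Assumption: unchanged prices are grouped with downward moves (label 0).
    The result has one fewer element than `closes`.
    """
    return [1 if nxt > cur else 0 for cur, nxt in zip(closes, closes[1:])]


def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the actual label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


# Made-up closing prices for illustration only.
closes = [10.0, 10.5, 10.2, 10.2, 10.8]
labels = label_movements(closes)

# A trivial "always up" baseline that any classifier should be compared against.
baseline = [1] * len(labels)
baseline_accuracy = accuracy(baseline, labels)
```

In this framing, a classifier is useful only to the extent that it beats such a trivial baseline on held-out days.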

Experiments

In this section, we conduct experiments to evaluate the effectiveness of using knowledge from Xueqiu to predict stock price movements.

Conclusions and future work

In this study, we have examined a unique social network, namely Xueqiu, where retail investors’ tweets can be used to extract features such as sentiment and perceived stock relatedness. Further, we adopt an SVM model and an MLP model to predict stock price movements in China's market. The results show that the predictive performance can be improved by including the features of sentiment and perceived stock relatedness. The study contributes to both social network analysis and the

Acknowledgements

This work was supported in part by State Key Development Program of Basic Research of China (No. 2013CB329604), the Natural Science Foundation of China (No. 61370068, 61300014, 61372191), the Project on the Integration of Industry, Education and Research of Guangdong Province (No. 2016B090921001), and DongGuan Innovative Research Team Program (No. 201636000100038).

References (58)

  • R. Genuer et al.

    Variable selection using random forests

    Pattern Recognit. Lett.

    (2010)
  • H. Kwak et al.

    What is twitter, a social network or a news media?

    Proceedings of the 19th International Conference on World Wide Web, ACM

    (2010)
  • Y. Su et al.

    Understanding information diffusion under interactions

    IJCAI

    (2016)
  • A. Anderson et al.

    Global diffusion via cascading invitations: structure, growth, and homophily

  • B. Viswanath et al.

    On the evolution of user interaction in facebook

    Proceedings of the 2nd ACM Workshop on Online Social Networks, ACM

    (2009)
  • E.F. Fama

    The behavior of stock-market prices

    J. Bus.

    (1965)
  • R.R. Prechter
    (1999)
  • J.R. Nofsinger

    Social mood and financial economics

    J. Behav. Finance

    (2005)
  • B.F. King

    Market and industry factors in stock price behavior

    J. Bus.

    (1966)
  • R.S. Pindyck et al.

    The comovement of stock prices

    Q. J. Econ.

    (1993)
  • Y. Karabulut

    Can facebook predict stock market activity?

    AFA 2013 San Diego Meetings Paper

    (2013)
  • A. Al Nasseri et al.

    Big data analysis of stocktwits to predict sentiments in the stock market

    Discovery Science, Springer

    (2015)
  • A.F. Darrat et al.

    On testing the random walk hypothesis: a model comparison approach

    Financ. Rev.

    (2000)
  • B.M. Barber et al.

    All that glitters: the effect of attention and news on the buying behavior of individual and institutional investors

    Rev. Financ. Stud.

    (2008)
  • J.C. Patra et al.

    Computationally efficient FLANN-based intelligent stock price prediction system

    International Joint Conference on Neural Networks, 2009, IJCNN 2009, IEEE

    (2009)
  • H. Jia

    Investigation into the Effectiveness of Long Short Term Memory Networks for Stock Price Prediction

    (2017)
  • Y. Lin et al.

    An SVM-based approach for stock market trend prediction

    The 2013 International Joint Conference on Neural Networks (IJCNN)

    (2013)
  • V. Turchenko et al.

    Short-term stock price prediction using MLP in moving simulation mode

    Proceedings of the 6th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems, vol. 2

    (2011)
  • R. Mahanta et al.

    Optimized radial basis functional neural network for stock index prediction

    2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT)

    (2016)

    Xi Zhang received the PhD degree in Computer Science from Tsinghua University. He is an assistant professor in School of Cyberspace Security, Beijing University of Posts and Telecommunications, and is also the vice director of Key Laboratory of Trustworthy Distributed Computing and Service, Ministry of Education, China. He was a visiting scholar at the University of Illinois at Chicago from March 2015 to March 2016. His research interests include big data, data mining, and computer architecture. He is the young associate editor of the Frontiers of Computer Science.

    Jiawei Shi received a bachelor's degree in 2012. He is now a PhD student in Key Laboratory of Trustworthy Distributed Computing and Service (Beijing University of Posts and Telecommunications), Ministry of Education.

    Di Wang received a master's degree in 2017 from Beijing University of Posts and Telecommunications.

    Binxing Fang received his PhD degree from Harbin Institute of Technology, China in 1989. He is a member of the Chinese Academy of Engineering and a professor in School of Cyberspace Security at Beijing University of Posts and Telecommunications. He is currently the chief scientist of State Key Development Program of Basic Research of China. He is also a professor in Institute of Electronic and Information Engineering of UESTC in Guangdong. His current interests include big data, social network analysis, IOT search, and information security.
