Analysis of Twitter Data Using a Multiple-level Clustering Strategy

Baralis, Elena; Cerquitelli, Tania; Chiusano, Silvia; Grimaudo, Luigi; Xiao, Xin

doi:10.1007/978-3-642-41366-7_2


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 8216)

Abstract

Twitter, currently the leading microblogging social network, has attracted a large body of research. This paper proposes a data analysis framework to discover groups of similar Twitter messages posted about a given event. Analyzing these groups makes it possible to extract the emotions or thoughts that users appear to associate with specific events, as well as the aspects that characterize events according to user perception. To deal with the inherent sparseness of micro-messages, the proposed approach relies on a multiple-level strategy that allows clustering text data with a variable distribution. Clusters are then characterized through the most representative words appearing in their messages, and association rules are used to highlight correlations among these words. To measure the relevance of specific words for a given event, text data is represented in the Vector Space Model using the TF-IDF weighting score. As a case study, two real Twitter datasets are analyzed.
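
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the clustering algorithm, so DBSCAN with cosine distance is assumed here. The function names, the (eps, min_pts) settings per level, and the toy tokenized tweets are all hypothetical. Each level re-clusters only the messages left as noise by the previous level, which is one plausible reading of a "multiple-level strategy" for data with a variable distribution.

```python
import math
from collections import Counter

NOISE = -1

def tfidf_vectors(docs):
    """Return one L2-normalised TF-IDF dict (term -> weight) per tokenised doc."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors

def cosine_distance(a, b):
    """1 - cosine similarity of two already-normalised sparse vectors."""
    return 1.0 - sum(w * b.get(t, 0.0) for t, w in a.items())

def dbscan(vectors, ids, eps, min_pts):
    """Plain DBSCAN restricted to the points in `ids`; returns (clusters, noise)."""
    def neighbours(i):
        return [j for j in ids if cosine_distance(vectors[i], vectors[j]) <= eps]

    labels = {}
    n_clusters = 0
    for i in ids:
        if i in labels:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cid = n_clusters
        n_clusters += 1
        labels[i] = cid
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels.get(j) == NOISE:
                labels[j] = cid        # noise reachable from a core point: border
            if j in labels:
                continue
            labels[j] = cid
            nb = neighbours(j)
            if len(nb) >= min_pts:     # j is a core point, keep expanding
                queue.extend(nb)
    clusters = [[i for i in ids if labels[i] == c] for c in range(n_clusters)]
    noise = [i for i in ids if labels[i] == NOISE]
    return clusters, noise

def multilevel_dbscan(vectors, settings):
    """Assumed multi-level scheme: each level re-clusters the previous noise."""
    remaining = list(range(len(vectors)))
    per_level = []
    for eps, min_pts in settings:
        clusters, remaining = dbscan(vectors, remaining, eps, min_pts)
        per_level.append(clusters)
        if not remaining:
            break
    return per_level, remaining

def top_terms(vectors, cluster, k):
    """Rank a cluster's terms by summed TF-IDF weight (ties broken by name)."""
    totals = Counter()
    for i in cluster:
        totals.update(vectors[i])
    return [t for t, _ in sorted(totals.items(), key=lambda x: (-x[1], x[0]))[:k]]

# Hypothetical, already tokenised tweets.
tweets = [
    ["final", "goal", "amazing"],
    ["final", "goal", "amazing"],
    ["final", "goal", "amazing"],
    ["concert", "tickets", "soldout"],
    ["concert", "tickets", "queue"],
    ["weather", "rain"],
]
vectors = tfidf_vectors(tweets)
# Level 1 is strict (dense groups only); level 2 is looser (sparser groups).
levels, noise = multilevel_dbscan(vectors, [(0.2, 3), (0.6, 2)])
print(levels, noise)
print(top_terms(vectors, levels[1][0], 2))
```

On this toy input, the strict first level groups the three identical messages, and the looser second level groups the two concert messages. The unrelated weather message remains noise. Mining association rules among the representative words of each cluster, as the abstract describes, is left out of the sketch.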



Author information

Authors and Affiliations

Dipartimento di Automatica e Informatica, Politecnico di Torino, Torino, Italy
Elena Baralis, Tania Cerquitelli, Silvia Chiusano, Luigi Grimaudo & Xin Xiao


Editor information

Editors and Affiliations

ICAR-CNR and University of Calabria, 87036, Cosenza, Italy
Alfredo Cuzzocrea
INRIA-Bordeaux Sud Ouest, France
Sofian Maabout


About this paper

Cite this paper

Baralis, E., Cerquitelli, T., Chiusano, S., Grimaudo, L., Xiao, X. (2013). Analysis of Twitter Data Using a Multiple-level Clustering Strategy. In: Cuzzocrea, A., Maabout, S. (eds) Model and Data Engineering. MEDI 2013. Lecture Notes in Computer Science, vol 8216. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41366-7_2


DOI: https://doi.org/10.1007/978-3-642-41366-7_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41365-0
Online ISBN: 978-3-642-41366-7
eBook Packages: Computer Science, Computer Science (R0)
