Abstract
In this paper, we propose four specifications that can be used to evaluate community identification algorithms. We then present VHITS, a novel algorithm that meets all four specifications. VHITS follows a two-step approach: in the first step, Nonnegative Matrix Factorization (NMF) is used to estimate community memberships; in the second step, a voting scheme is employed to identify the hubs and authorities of each community. VHITS is then compared with the HITS and PHITS algorithms. Experimental results show that VHITS is better suited than HITS and PHITS to the task of community identification in citation networks.
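The two-step idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the details of either step, so everything below is an assumption made for illustration. In particular, the function names (`nmf`, `memberships`, `vote`) are hypothetical, the factorization uses the standard multiplicative updates for the Frobenius objective, memberships are taken as the arg-max over factors of the symmetrized link matrix, and a "vote" is simply a citation between two members of the same community.

```python
import numpy as np


def nmf(A, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix A as W @ H using multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Updates keep W and H nonnegative; eps avoids division by zero.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


def memberships(A, k, **nmf_kwargs):
    """Step 1 (assumed form): assign each document to one community."""
    # Assumption: symmetrize so that both in-links and out-links contribute.
    S = A + A.T
    W, _ = nmf(S, k, **nmf_kwargs)
    return W.argmax(axis=1)


def vote(A, labels, k):
    """Step 2 (assumed form): count intra-community citations as votes.

    A[i, j] = 1 means document i cites document j.
    Returns (hubs, auths), each of shape (k, n).
    """
    n = A.shape[0]
    hubs = np.zeros((k, n))
    auths = np.zeros((k, n))
    for c in range(k):
        members = (labels == c).astype(float)
        # Authority score: citations received from members of community c.
        auths[c] = (members @ A) * members
        # Hub score: citations given to members of community c.
        hubs[c] = (A @ members) * members
    return hubs, auths


# Toy citation network: documents 1 and 2 cite 0; documents 4 and 5 cite 3.
A = np.zeros((6, 6))
A[1, 0] = A[2, 0] = 1.0
A[4, 3] = A[5, 3] = 1.0

labels = memberships(A, k=2)
hubs, auths = vote(A, labels, k=2)
```

Separating the membership step from the voting step means the ranking of hubs and authorities can be inspected for any labeling, including one supplied by hand, which makes the second step easy to check independently of how well the factorization converged.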
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chikhi, N.F., Rothenburger, B., Aussenac-Gilles, N. (2008). A New Algorithm for Community Identification in Linked Data. In: Lovrek, I., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2008. Lecture Notes in Computer Science, vol 5177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85563-7_81
DOI: https://doi.org/10.1007/978-3-540-85563-7_81
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85562-0
Online ISBN: 978-3-540-85563-7
eBook Packages: Computer Science (R0)