Abstract
This paper describes a novel technique, called \(\mathcal{D}\)-walks, for semi-supervised classification in large graphs. We introduce a betweenness measure based on passage times during random walks of bounded length. Such walks are further constrained to start and end in nodes of the same class, which defines a distinct betweenness for each class. Each unlabeled node is assigned to the class for which its betweenness is highest. Forward and backward recurrences are derived to compute the passage times efficiently. \(\mathcal{D}\)-walks can handle directed or undirected graphs with a time complexity that is linear in the number of edges, the maximum walk length considered and the number of classes. Experiments on various real-life databases show that \(\mathcal{D}\)-walks outperforms NetKit [5], the approach of Zhou and Schölkopf [15] and the regularized Laplacian kernel [2]. The benefit of \(\mathcal{D}\)-walks is particularly noticeable when few labeled nodes are available. The computation time of \(\mathcal{D}\)-walks is also substantially lower in all cases.
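The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the recurrences, so the forward and backward variables below are an assumed formulation in which a walk leaves a node of class y, avoids class-y nodes in between, and stops when it first returns to class y. The function name `dwalk_betweenness`, the `-1` marker for unlabeled nodes and the uniform start over labeled nodes are all hypothetical choices. The sketch uses dense NumPy arrays for brevity, so it does not reach the linear complexity in the number of edges claimed for the method.

```python
import numpy as np


def dwalk_betweenness(adj, labels, max_len):
    """Class-wise betweenness from bounded random walks (illustrative sketch).

    adj     : (n, n) adjacency matrix, directed or undirected.
    labels  : length-n integer array, -1 marks an unlabeled node (assumption).
    max_len : maximum walk length considered, at least 1.
    Returns (classes, scores) with scores[q, c] the betweenness of node q
    for class classes[c].
    """
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    n = adj.shape[0]

    # Row-normalize the adjacency matrix into transition probabilities.
    deg = adj.sum(axis=1, keepdims=True)
    P = np.divide(adj, deg, out=np.zeros_like(adj), where=deg > 0)

    classes = np.unique(labels[labels >= 0])
    scores = np.zeros((n, classes.size))
    for c, y in enumerate(classes):
        in_y = labels == y

        # Forward: alpha[t, q] = probability of being in q after t steps,
        # starting uniformly in class y and avoiding class y at steps 1..t-1.
        alpha = np.zeros((max_len + 1, n))
        alpha[0, in_y] = 1.0 / in_y.sum()
        alpha[1] = alpha[0] @ P
        for t in range(2, max_len + 1):
            alpha[t] = np.where(in_y, 0.0, alpha[t - 1]) @ P

        # Backward: beta[t, q] = probability that a walk leaving q
        # reaches class y for the first time after exactly t steps.
        beta = np.zeros((max_len + 1, n))
        beta[1] = P[:, in_y].sum(axis=1)
        for t in range(2, max_len + 1):
            beta[t] = P @ np.where(in_y, 0.0, beta[t - 1])

        # Expected number of passages through each node, accumulated over
        # all walk lengths l <= max_len and all passage times t < l.
        visits = np.zeros(n)
        for l in range(2, max_len + 1):
            for t in range(1, l):
                visits += alpha[t] * beta[l - t]

        # Normalize by the probability that a walk returns to class y
        # within max_len steps.
        mass = alpha[1:, in_y].sum()
        if mass > 0:
            scores[:, c] = visits / mass
    return classes, scores


if __name__ == "__main__":
    # Path graph 0-1-2-3-4-5 with one labeled node at each end.
    n = 6
    adj = np.zeros((n, n))
    for i in range(n - 1):
        adj[i, i + 1] = adj[i + 1, i] = 1.0
    labels = np.array([0, -1, -1, -1, -1, 1])

    classes, scores = dwalk_betweenness(adj, labels, max_len=6)
    unlabeled = np.flatnonzero(labels < 0)
    print(dict(zip(unlabeled.tolist(),
                   classes[scores[unlabeled].argmax(axis=1)].tolist())))
```

On this toy path graph, each unlabeled node is assigned to the class of the nearer labeled endpoint. Replacing the dense products with sparse ones would bring the cost of each recurrence step in line with the number of edges.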
Part of this work was supported by the STRATEGO project funded by the Région wallonne, Belgium.
References
1. Callut, J.: First Passage Times Dynamics in Markov Models with Applications to HMM Induction, Sequence Classification, and Graph Mining. PhD thesis, Université catholique de Louvain (October 2007)
2. Chebotarev, P., Shamis, E.: The matrix-forest theorem and measuring relations in small social groups. Automation and Remote Control 58(9), 1505–1514 (1997)
3. Chebotarev, P., Shamis, E.: On proximity measures for graph vertices. Automation and Remote Control 59(10), 1443–1459 (1998)
4. Kemeny, J.G., Snell, J.L.: Finite Markov Chains. Springer, Heidelberg (1983)
5. Macskassy, S.A., Provost, F.: Classification in networked data: A toolkit and a univariate case study. J. Mach. Learn. Res. 8, 935–983 (2007)
6. Newman, M.E.J.: A measure of betweenness centrality based on random walks. Social Networks 27, 39–54 (2005)
7. Norris, J.R.: Markov Chains. Cambridge University Press, United Kingdom (1997)
8. Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: Bringing order to the web. Technical Report, Computer System Laboratory, Stanford University (1998)
9. Rabiner, L., Juang, B.-H.: Fundamentals of Speech Recognition. Prentice-Hall, Englewood Cliffs (1993)
10. Smola, A.J., Kondor, R.: Kernels and regularization on graphs. In: Schölkopf, B., Warmuth, M.K. (eds.) COLT/Kernel 2003. LNCS (LNAI), vol. 2777, pp. 144–158. Springer, Heidelberg (2003)
11. Szummer, M., Jaakkola, T.: Partially labeled classification with Markov random walks. In: Advances in Neural Information Processing Systems, vol. 14, pp. 945–952 (2002)
12. Tsuda, K., Noble, W.S.: Learning kernels from biological networks by maximizing entropy. Bioinformatics 20(1), 326–333 (2004)
13. Viger, F., Latapy, M.: Efficient and simple generation of random simple connected graphs with prescribed degree sequence. In: Wang, L. (ed.) COCOON 2005. LNCS, vol. 3595, pp. 440–449. Springer, Heidelberg (2005)
14. Zhou, D., Huang, J., Schölkopf, B.: Learning from labeled and unlabeled data on a directed graph. In: ICML 2005: Proceedings of the 22nd International Conference on Machine Learning, pp. 1036–1043. ACM, New York (2005)
15. Zhou, D., Schölkopf, B.: Learning from labeled and unlabeled data using random walks. In: Rasmussen, C.E., Bülthoff, H.H., Schölkopf, B., Giese, M.A. (eds.) DAGM 2004. LNCS, vol. 3175, pp. 237–244. Springer, Heidelberg (2004)
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Callut, J., Françoisse, K., Saerens, M., Dupont, P. (2008). Semi-supervised Classification from Discriminative Random Walks. In: Daelemans, W., Goethals, B., Morik, K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2008. Lecture Notes in Computer Science, vol 5211. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87479-9_29
DOI: https://doi.org/10.1007/978-3-540-87479-9_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87478-2
Online ISBN: 978-3-540-87479-9
eBook Packages: Computer Science, Computer Science (R0)