Abstract
We present a supervised multi-label classification method for automatic image annotation. The method estimates the annotation labels of a test image by accumulating similarities between the test image and labeled training images. The similarities are measured from a sparse representation of the test image by the training images, which avoids similarity votes for irrelevant classes. In addition, our sparse representation-based multi-label classification can estimate a suitable combination of labels even if that combination does not appear in the training data. Experimental results on the PASCAL dataset suggest that the method is effective for image annotation compared with existing SVM-based multi-labeling methods. Nonlinear mapping of the image representation using the kernel trick is also shown to enhance annotation performance.
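The voting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the non-negativity constraint on the coefficients, the projected ISTA solver, the regularization weight `lam`, the normalization by the maximum score, the `threshold` cut-off, the function names `sparse_code` and `vote_labels`, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def sparse_code(D, x, lam=0.1, n_iter=200):
    """Non-negative l1-regularized least squares solved by projected ISTA.

    Minimizes 0.5 * ||x - D a||^2 + lam * sum(a) subject to a >= 0.
    D holds one training feature vector per column (assumed layout).
    """
    # Step size is 1 / Lipschitz constant of the gradient of the quadratic term.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        # Gradient step on the smooth part, then shrink and clip at zero.
        a = np.maximum(a - step * (grad + lam), 0.0)
    return a


def vote_labels(D, Y, x, lam=0.1, threshold=0.5):
    """Accumulate sparse coefficients as similarity votes for each label.

    Y is a binary (n_train, n_labels) matrix. The threshold is a hypothetical
    cut-off applied to scores normalized by the largest score.
    """
    a = sparse_code(D, x, lam)
    scores = Y.T @ a  # each training image votes for its own labels
    top = scores.max()
    if top > 0:
        scores = scores / top
    return scores, scores >= threshold


# Toy data (hypothetical): four training images with orthonormal features.
D = np.eye(4)
Y = np.array([[1, 0, 0],   # image 0 carries label 0
              [0, 1, 0],   # image 1 carries label 1
              [0, 0, 1],   # image 2 carries label 2
              [1, 0, 1]])  # image 3 carries labels 0 and 2
x = np.array([0.5, 0.5, 0.0, 0.0])  # test image mixing images 0 and 1

scores, predicted = vote_labels(D, Y, x)
print(scores, predicted)
```

In this toy case only training images 0 and 1 receive nonzero coefficients, so the test image is assigned labels 0 and 1, a pair that no single training image carries. Images 2 and 3 get zero weight and cast no votes, which is the sense in which sparsity suppresses votes for irrelevant classes.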
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sakai, T., Itoh, H., Imiya, A. (2011). Multi-label Classification for Image Annotation via Sparse Similarity Voting. In: Koch, R., Huang, F. (eds) Computer Vision – ACCV 2010 Workshops. ACCV 2010. Lecture Notes in Computer Science, vol 6469. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22819-3_35
DOI: https://doi.org/10.1007/978-3-642-22819-3_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22818-6
Online ISBN: 978-3-642-22819-3
eBook Packages: Computer Science, Computer Science (R0)