Abstract
Fine-grained visual categorization aims to classify visual data at a subordinate level, e.g., identifying different species of birds. It is a highly challenging topic that has recently received significant research attention. Most existing work has focused on designing more discriminative feature representations to capture the subtle visual differences among categories; very limited effort has been spent on designing robust model learning algorithms. In this paper, we treat the training of each category classifier as a single learning task and formulate a generic multi-task learning (MTL) framework to train multiple classifiers simultaneously. Unlike existing MTL methods, the proposed generic MTL algorithm enforces no structural assumptions and is thus more flexible in handling complex inter-class relationships. In particular, it can automatically discover both clusters of similar categories and outliers. We show that the objective of our generic MTL formulation can be solved using an iterative reweighted ℓ2 method. Through extensive experimental validation, we demonstrate that our method outperforms several state-of-the-art approaches.
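As a rough illustration of what an iterative reweighted ℓ2 scheme looks like, the sketch below applies the idea to a toy ℓ1-penalized denoising problem, min_w 0.5‖w − y‖² + λ‖w‖₁. The abstract does not state the MTL objective, so this is not the paper's formulation; the problem, the function names `reweighted_l2` and `soft_threshold`, and the parameters `lam`, `n_iter`, and `eps` are all assumptions chosen for illustration.

```python
def soft_threshold(y, lam):
    """Closed-form minimizer of 0.5*(w - y)**2 + lam*abs(w), used as a reference."""
    if y > lam:
        return y - lam
    if y < -lam:
        return y + lam
    return 0.0


def reweighted_l2(y, lam, n_iter=200, eps=1e-8):
    """Iteratively reweighted l2 for min_w 0.5*||w - y||^2 + lam*||w||_1.

    Each pass replaces |w_i| with the quadratic surrogate
    w_i**2 / (2*(|w_i_prev| + eps)), so every subproblem is a weighted
    l2-regularized least-squares problem with a closed-form solution.
    """
    w = list(y)  # start from the unregularized solution
    for _ in range(n_iter):
        # Small coefficients get large weights, which pushes them toward zero.
        weights = [1.0 / (abs(wi) + eps) for wi in w]
        # Minimizer of 0.5*(w_i - y_i)**2 + 0.5*lam*weights_i*w_i**2.
        w = [yi / (1.0 + lam * di) for yi, di in zip(y, weights)]
    return w


if __name__ == "__main__":
    y = [3.0, 0.5, -2.0, 0.0]
    print(reweighted_l2(y, lam=1.0))
    print([soft_threshold(yi, 1.0) for yi in y])
```

On this toy problem the iterates approach the soft-thresholding solution, which gives a simple sanity check. In a multi-task setting the same pattern would presumably be applied to a non-smooth penalty on the classifier weights, with each reweighted subproblem solved by a standard ℓ2-regularized solver.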
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Pu, J., Jiang, YG., Wang, J., Xue, X. (2014). Which Looks Like Which: Exploring Inter-class Relationships in Fine-Grained Visual Categorization. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8691. Springer, Cham. https://doi.org/10.1007/978-3-319-10578-9_28
DOI: https://doi.org/10.1007/978-3-319-10578-9_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10577-2
Online ISBN: 978-3-319-10578-9
eBook Packages: Computer Science (R0)