Abstract
Adopting the Counting Grid (CG) representation [1], the Spring Lattice Counting Grid (SLCG) model uses a grid of feature counts to capture the spatial layout that a variety of images tend to follow. Images are mapped to the counting grid with their features rearranged so as to balance the quality of the mapping against the extent of the rearrangement it requires. In particular, the feature sets originating from different image sectors are mapped to different sub-windows in the counting grid, in a configuration that is close to, but not exactly the same as, the configuration of the source sectors. The distribution over deformations of the sector configuration is learned with a new spring lattice model, while the rearrangement of features within a sector is unconstrained. As a result, the CG model gains a more appropriate level of invariance to realistic image transformations such as viewpoint changes, rotations, or changes of scale. We tested SLCG on standard scene recognition datasets and on a dataset collected with a wearable camera that recorded the wearer’s visual input over three weeks. Our algorithm correctly classifies the visited locations more than 80% of the time, outperforming previous approaches to visual location recognition. At this level of performance, a variety of real-world applications of wearable cameras become feasible.
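The mapping described in the abstract can be illustrated with a small sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function names (`window_histograms`, `sector_log_likelihood`, `map_image`), the toroidal grid, the brute-force search, and above all the quadratic penalty that ties each sector to its own rest position. The paper's spring lattice learns a distribution over deformations of the whole sector configuration; the independent per-sector springs here are a deliberate simplification.

```python
import numpy as np

def window_histograms(grid, win):
    """Average the per-cell feature distributions over every win x win
    window of the grid (with wrap-around). grid has shape (E1, E2, Z)."""
    h = np.zeros_like(grid, dtype=float)
    for dy in range(win):
        for dx in range(win):
            # result[y, x] = grid[y + dy, x + dx] (indices taken modulo grid size)
            h += np.roll(grid, shift=(-dy, -dx), axis=(0, 1))
    return h / (win * win)

def sector_log_likelihood(h, counts):
    """Log-likelihood of one sector's feature counts at every grid location."""
    return np.tensordot(np.log(h + 1e-12), counts, axes=([2], [0]))

def map_image(h, sector_counts, rest_offsets, stiffness=1.0, max_shift=1):
    """Pick the anchor location that maximizes the summed sector scores.

    Each sector sits at anchor + rest offset + displacement, and pays a
    hypothetical quadratic penalty stiffness * |displacement|^2.
    """
    E1, E2, _ = h.shape
    shifts = [(dy, dx)
              for dy in range(-max_shift, max_shift + 1)
              for dx in range(-max_shift, max_shift + 1)]
    total = np.zeros((E1, E2))
    for counts, (ry, rx) in zip(sector_counts, rest_offsets):
        ll = sector_log_likelihood(h, counts)
        best = np.full((E1, E2), -np.inf)
        for dy, dx in shifts:
            # placed[a] = score of putting this sector at a + rest + displacement
            placed = np.roll(ll, shift=(-(ry + dy), -(rx + dx)), axis=(0, 1))
            best = np.maximum(best, placed - stiffness * (dy * dy + dx * dx))
        total += best
    anchor = tuple(int(i) for i in np.unravel_index(np.argmax(total), total.shape))
    return anchor, float(total[anchor])
```

Calling `map_image` with one count vector per image sector and the sectors' relative positions as `rest_offsets` returns the best anchor location and its score. A larger `stiffness` keeps sectors closer to their original layout, while `stiffness=0` lets each sector move freely within `max_shift`.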
References
Jojic, N., Perina, A.: Multidimensional counting grids: Inferring word order from disordered bags of words. In: UAI 2011, pp. 547–556 (2011)
Perina, A., Jojic, N.: Image analysis by counting on a grid. In: CVPR 2011, pp. 1985–1992 (2011)
Bosch, A., Zisserman, A., Muñoz, X.: Scene Classification Via pLSA. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006, Part IV. LNCS, vol. 3954, pp. 517–530. Springer, Heidelberg (2006)
Jojic, N., Perina, A., Murino, V.: Structural Epitome: a way to summarize one’s visual experience. In: NIPS 2010, pp. 1027–1035 (2010)
Li, F.-F., Perona, P.: A Bayesian Hierarchical Model for Learning Natural Scene Categories. In: CVPR (2), pp. 524–531 (2005)
Perina, A., Cristani, M., Castellani, U., Murino, V., Jojic, N.: Free Energy score space. In: NIPS 2009, pp. 1428–1436 (2009)
Jojic, N., Frey, B.J., Kannan, A.: Epitomic analysis of appearance and shape. In: ICCV 2003, pp. 34–43 (2003)
Oliva, A., Torralba, A.: Modeling the shape of the scene: A holistic representation of the spatial envelope. Int. Jrn. of Computer Vision 42 (2001)
Lazebnik, S., Schmid, C., Ponce, J.: Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. In: CVPR (2), pp. 2169–2178 (2006)
Lowe, D.: Distinctive Image Features from Scale-Invariant Keypoints. Int. Jrn. of Computer Vision 60 (2004)
Zhu, J., Li, L.-J., Li, F.-F., Xing, E.P.: Large Margin Learning of Upstream Scene Understanding Models. In: NIPS 2010, pp. 2586–2594 (2010)
Quattoni, A., Torralba, A.: Recognizing indoor scenes. In: CVPR 2009, pp. 413–420 (2009)
Torralba, A., Murphy, K.P., Freeman, W.T., Rubin, M.A.: Context-based vision system for place and object recognition. In: ICCV 2003, pp. 273–280 (2003)
Blei, D., Ng, A., Jordan, M.: Latent Dirichlet Allocation. Journal of Machine Learning Research 3(5) (2003)
Fergus, R., Perona, P., Zisserman, A.: Object Class Recognition by Unsupervised Scale-Invariant Learning. In: CVPR 2003 (2003)
Sudderth, E., Ihler, A., Isard, M., Freeman, W., Willsky, A.: Nonparametric Belief Propagation. In: CVPR 2003 (2003)
Isard, M.: PAMPAS: Real-Valued Graphical Models for Computer Vision. In: CVPR 2003 (2003)
Sudderth, E., Mandel, M., Freeman, W., Willsky, A.: Visual Hand Tracking Using Nonparametric Belief Propagation. In: CVPR 2004 Workshop on Generative Model Based Vision (2004)
Parizi, S.N., Oberlin, J., Felzenszwalb, P.F.: Reconfigurable models for scene recognition. In: CVPR 2012 (2012)
Pandey, M., Lazebnik, S.: Scene recognition and weakly supervised object localization with deformable part-based models. In: ICCV 2011 (2011)
Krahenbuhl, P., Koltun, V.: Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials. In: NIPS 2011 (2011)
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Perina, A., Jojic, N. (2012). Spring Lattice Counting Grids: Scene Recognition Using Deformable Positional Constraints. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33783-3_60
DOI: https://doi.org/10.1007/978-3-642-33783-3_60
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33782-6
Online ISBN: 978-3-642-33783-3
eBook Packages: Computer Science, Computer Science (R0)