Abstract
This paper addresses visual object perception for mobile robotics. Perceiving household objects in unstructured environments is a key capability for robots that must perform complex tasks in the home. The task is daunting, however: it requires handling the variability of image formation from a moving camera under tight time constraints. The paper draws attention to some of the issues that arise when applying three state-of-the-art object recognition and detection methods in a mobile robotics scenario, and proposes methods to deal with windowing/segmentation. The work thus evaluates the state of the art in object perception with the aim of developing a lightweight solution for mobile robotics use and research in typical indoor settings.
Cite this article
Ramisa, A., Aldavert, D., Vasudevan, S. et al. Evaluation of Three Vision Based Object Perception Methods for a Mobile Robot. J Intell Robot Syst 68, 185–208 (2012). https://doi.org/10.1007/s10846-012-9675-8
DOI: https://doi.org/10.1007/s10846-012-9675-8