Robotics and Autonomous Systems

Volume 54, Issue 9, 30 September 2006, Pages 740-749

Loop closure detection in SLAM by combining visual and spatial appearance

https://doi.org/10.1016/j.robot.2006.04.016

Abstract

In this paper we describe a system for use on a mobile robot that detects potential loop closures using both the visual and spatial appearance of local scenes. Loop closing is the act of correctly asserting that a vehicle has returned to a previously visited location. Current approaches rely heavily on vehicle pose estimates to prompt loop closure. Paradoxically, these approaches are least reliable exactly when the need for accurate loop closure detection is greatest. Our underlying approach relies instead upon matching distinctive ‘signatures’ of individual local scenes to prompt loop closure. A key advantage of this method is that it is entirely independent of the navigation and/or mapping process and so is entirely unaffected by gross errors in pose estimation. Another advantage, which is explored in this paper, is the possibility of enhancing the robustness of loop closure detection by incorporating heterogeneous sensory observations. We show how a description of local spatial appearance (using laser rangefinder data) can be combined with visual descriptions to form multi-sensory signatures of local scenes which enhance loop-closure detection.

Section snippets

Introduction and motivation

SLAM (simultaneous localization and mapping) is a core information engineering problem in mobile robotics and has received much attention in recent years, especially regarding its estimation-theoretic aspects [1], [17], [25]. Good progress has been made, but SLAM is still far from being an established and reliable technology. A major problem is a lack of robustness, which manifests most markedly during what has become known as loop closing. It is common practice to use estimates produced by a SLAM

Derivation of visual signature

We begin by describing how to derive a visual signature of a local scene. Periodically, an image of the surrounding environment is captured. This image is unique to the specific pose of the robot with respect to the environment. The goal is to reduce this image to a set of descriptors. Each image's set of descriptors is compared against those of every other image to determine the similarity between local scenes. Many authors have successfully used visual landmarks in SLAM, for example [20], [4]. In this
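To make the descriptor-set comparison concrete, the sketch below scores the similarity of two scenes from their visual descriptor sets. It is a minimal illustration under stated assumptions, not the authors' exact procedure: the SIFT-like descriptor arrays, the ratio-test threshold and the `scene_similarity` function are all assumptions introduced here.

```python
# Minimal sketch of comparing two scenes by their visual descriptor sets.
# Assumes each scene yields an (N, D) array of SIFT-like descriptor vectors;
# the ratio-test matching rule is a common heuristic, not necessarily the
# authors' exact procedure.
import numpy as np

def scene_similarity(desc_a: np.ndarray, desc_b: np.ndarray,
                     ratio: float = 0.8) -> float:
    """Fraction of descriptors in desc_a with a distinctive match in desc_b."""
    if len(desc_a) == 0 or len(desc_b) < 2:
        return 0.0
    # Pairwise Euclidean distances between the two descriptor sets.
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    d.sort(axis=1)  # per-query distances, ascending
    # Ratio test: the nearest neighbour must be clearly better than the
    # second nearest for the descriptor to count as matched.
    matches = d[:, 0] < ratio * d[:, 1]
    return float(matches.mean())
```

Scenes whose similarity score exceeds some threshold would then be flagged as candidate loop closures, independently of any pose estimate.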

Derivation of spatial signature

A laser scan can be considered to be a top-view image of the geometric structure of the environment. Though most efforts have concentrated on extracting shape descriptors of 2D objects in images [26], [2], the authors of [10] have applied their shape similarity system to the problem of robot localization and mapping, in recognition of the similarity between these two problems. We begin by describing how a complete laser patch is passed through a pipeline of processes resulting in a set of descriptors that encode the
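As a hint of what the first stage of such a pipeline might look like, the sketch below splits a 2D laser scan into contiguous segments at large range discontinuities. The jump threshold, the minimum segment size and the `segment_scan` name are illustrative assumptions; the paper's own pipeline continues by extracting a descriptor for each resulting segment.

```python
# Hypothetical first stage of a laser-patch pipeline: split a 2D scan into
# contiguous segments at large range discontinuities. The threshold values
# are assumptions, not the paper's parameters.
import numpy as np

def segment_scan(ranges: np.ndarray, angles: np.ndarray,
                 jump: float = 0.5, min_pts: int = 5):
    """Return a list of (K, 2) point arrays, one per detected segment."""
    # Convert polar returns to Cartesian points in the sensor frame.
    xy = np.stack([ranges * np.cos(angles), ranges * np.sin(angles)], axis=1)
    # Break the scan wherever adjacent returns are farther apart than `jump`.
    breaks = np.where(np.abs(np.diff(ranges)) > jump)[0] + 1
    segments = np.split(xy, breaks)
    # Discard tiny fragments that carry little shape information.
    return [s for s in segments if len(s) >= min_pts]
```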

Segment descriptor comparison

We now describe how two segment descriptors generated according to Section 3.2 can be compared to one another. Each segment (a node in the graph of Fig. 4) contains the CAF function, its entropy measure and a list of critical points. Given two such nodes, we use three disparity measures based on these properties.
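The sketch below shows the shape such a comparison can take: one disparity term per descriptor component, combined into a single score. The concrete measures, weights and names (`SegmentDescriptor`, `descriptor_disparity`) are stand-ins chosen for illustration; they are not the paper's three disparity measures.

```python
# Illustrative comparison of two segment descriptors. The paper combines
# three disparity measures over the CAF (cumulative angular function), its
# entropy and the critical points; the measures and weights below are
# assumed stand-ins showing the structure of such a comparison.
import numpy as np
from dataclasses import dataclass

@dataclass
class SegmentDescriptor:
    caf: np.ndarray           # CAF sampled at fixed arc-length intervals
    entropy: float            # entropy measure of the CAF
    critical_pts: np.ndarray  # (K, 2) salient points on the segment boundary

def descriptor_disparity(a: SegmentDescriptor, b: SegmentDescriptor,
                         w=(0.4, 0.3, 0.3)) -> float:
    # 1. Shape disparity: mean absolute difference of resampled CAFs.
    n = min(len(a.caf), len(b.caf))
    ca = np.interp(np.linspace(0, 1, n), np.linspace(0, 1, len(a.caf)), a.caf)
    cb = np.interp(np.linspace(0, 1, n), np.linspace(0, 1, len(b.caf)), b.caf)
    d_shape = float(np.mean(np.abs(ca - cb)))
    # 2. Entropy disparity: difference in the scalar entropy measures.
    d_ent = abs(a.entropy - b.entropy)
    # 3. Structural disparity: mismatch in critical-point counts.
    d_crit = abs(len(a.critical_pts) - len(b.critical_pts)) / max(
        len(a.critical_pts), len(b.critical_pts), 1)
    return w[0] * d_shape + w[1] * d_ent + w[2] * d_crit
```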

Matching of spatial descriptors

Our shape similarity metric comprises two parts: the shape similarity between two segments and the spatial similarity between segments. The quality of match between segment $S_i$ from the query scan and segment $S_j$ from the reference scan is defined by $Q_{ij} = \lambda\,\eta(S_i, S_j) + (1-\lambda)\,q_m$, where $Q_{ij} \in [0,1]$ and the parameter $\lambda \in [0,1]$ determines the relative importance attached to the matching of the shape of segments and the links between segments. It was determined experimentally that the value of 0.3 for $\lambda$
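The match-quality formula transcribes directly into code. The snippet below assumes the shape term $\eta(S_i, S_j)$ and the link term $q_m$ have already been computed and normalised to $[0,1]$, and uses the experimentally determined $\lambda = 0.3$ as its default.

```python
# Q_ij = lambda * eta(S_i, S_j) + (1 - lambda) * q_m  (from the text).
# eta_ij and q_m are assumed to be pre-computed similarity terms in [0, 1].
def match_quality(eta_ij: float, q_m: float, lam: float = 0.3) -> float:
    """Blend shape similarity (eta_ij) and link similarity (q_m)."""
    assert 0.0 <= lam <= 1.0, "lambda must lie in [0, 1]"
    return lam * eta_ij + (1.0 - lam) * q_m
```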

Experimental results

To examine and demonstrate the effectiveness of our approach, we tested our algorithm in an outdoor environment. The ATRV-Jnr mobile robot was driven around a car park in front of a building. The vehicle camera kept a constant orientation in vehicle coordinates — looking forward and slightly to the left. Every two seconds an image was grabbed and written to disk. The vehicle was equipped with a standard SICK laser, the output of which was also logged along with the odometry from the wheel

Conclusions

We have developed a system which uses both spatial and visual appearance to guide and aid the detection of loop-closure events. We described how spatial shape information may be encoded and compared using entropy and relative entropy respectively. The spatial matching process is designed to be robust to occlusion and viewpoint changes. It uses a redundant number of transformations between salient features on segment boundaries. Finally, overall spatial similarity between two laser patches is

References (27)

  • L. Latecki et al., Application of planar shape comparison to object retrieval in image databases, Pattern Recognition (2002)
  • M. Bosse et al., SLAM in large-scale cyclic environments using the Atlas framework, International Journal of Robotics Research (2004)
  • S. Belongie et al., Shape matching and object recognition using shape contexts, IEEE Transactions on Pattern Analysis and Machine Intelligence (2002)
  • S. Cohen, L. Guibas, Partial matching of planar polylines under similarity transformation, in: Eighth Annual ACM-SIAM...
  • A. Davison et al., Simultaneous localization and map-building using active vision, IEEE Transactions on Pattern Analysis and Machine Intelligence (2002)
  • R. Hinkel, T. Knieriemen, Environment perception with a laser radar in a fast moving robot, in: Proceedings of...
  • K. Ho, P. Newman, Multiple map intersection detection using visual appearance, in: International Conference on...
  • G. Jones et al., Retrieving spoken documents by combining multiple index sources, Research and Development in Information Retrieval (1996)
  • T. Kadir et al., Saliency, scale and image description, International Journal of Computer Vision (2001)
  • K. Konolige, Large-scale map-making, in: Proceedings of the National Conference on AI (AAAI), San Jose, CA,...
  • L. Latecki, R. Lakämper, D. Wolter, Shape similarity and visual parts, in: International Conference on Discrete...
  • D.G. Lowe et al., Mobile robot localization and mapping with uncertainty using scale-invariant visual landmarks, International Journal of Robotics Research (2002)
  • D.G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision (2004)

Kin Leong Ho received his B.Sc. in Systems Engineering with Distinction from the United States Naval Academy in 2003. He is a DPhil student at the Oxford University Robotics Research Group under the sponsorship of the Rhodes Scholarship. Currently, he is serving as a naval combat officer in the Republic of Singapore Navy. His research interests are in the area of mobile robotics, namely vision-based navigation, cooperative robotics and simultaneous localization and mapping.

Paul Newman obtained an M.Eng. in Engineering Science from Oxford University in 1995. After a brief sojourn in the telecommunications industry in 1996, he undertook a Ph.D. in autonomous navigation at the University of Sydney, Australia. In 1999 he returned to the United Kingdom to work in the commercial sub-sea navigation industry. In late 2000 he joined the Department of Ocean Engineering at M.I.T., where, as a post-doc and later a Research Scientist, he worked on algorithms and software for robust autonomous navigation for both land and sub-sea agents. In early 2003 he was appointed to a Departmental Lectureship and, in 2005, a University Lectureship in Information Engineering at the Department of Engineering Science, Oxford University. He heads the Mobile Robotics Research group and has research interests in pretty much anything to do with autonomous navigation.
