BIM-Tracker: A model-based visual tracking approach for indoor localisation using a 3D building model

https://doi.org/10.1016/j.isprsjprs.2019.02.014

Abstract

This article presents an accurate and robust visual indoor localisation approach that is not only infrastructure-free but also avoids error accumulation, by taking advantage of (1) the widespread ubiquity of mobile devices with cameras and (2) the availability of 3D building models for most modern buildings. Localisation is performed by matching image sequences captured by a camera with a 3D model of the building in a model-based visual tracking framework. A comprehensive evaluation of the approach with a photo-realistic synthetic dataset shows the robustness of the localisation approach under challenging conditions. Additionally, the approach is tested and evaluated on real data captured by a smartphone. The results of the experiments indicate that a localisation accuracy better than 10 cm can be achieved with this approach. Since localisation errors do not accumulate, the proposed approach is suitable for long-duration indoor localisation tasks and augmented reality applications, without requiring any local infrastructure. A MATLAB implementation can be found at https://github.com/debaditya-unimelb/BIM-Tracker.

Introduction

Indoor location information is a key enabler of a range of applications including navigation guidance, location-based services, emergency response, guiding vulnerable people and augmented reality. Indoor environments present a challenge for localisation due to the strong attenuation of Global Navigation Satellite System (GNSS) signals (Mautz, 2012) compared to outdoor environments. Many different approaches for indoor localisation have been proposed in the literature. However, the performance of indoor localisation still lags behind that of outdoor localisation (Lymberopoulos and Liu, 2017), and many applications are still waiting for an acceptable solution.

Present indoor localisation approaches are either infrastructure-dependent, in which case installation and maintenance are costly and not always feasible, or infrastructure-free, in which case they are not accurate enough for mass-market applications (Alarifi et al., 2016, Mautz, 2012). By infrastructure, we mean a dedicated network of sensors, transmitters or beacons installed in the indoor environment. Consequently, infrastructure-free indoor localisation has become a focus of research and development during the past decade, and improvements in indoor localisation systems are likely to open up better business prospects. Among the various infrastructure-free approaches, those based on digital television signals, FM radio signals, magnetic fields, ambient sound levels and barometers provide metre-level accuracy, which is not sufficient for many indoor location-based applications (Xie et al., 2014, Muralidharan et al., 2014, Ye et al., 2014, Tarzia et al., 2011, Serant et al., 2011).

Other infrastructure-free methods, such as pedestrian dead reckoning (PDR), visual odometry, and simultaneous localisation and mapping (SLAM), suffer from the accumulation of localisation errors, resulting in drift of the estimated trajectory (Khoshelham and Ramezani, 2017, Scaramuzza and Fraundorfer, 2011, Caron et al., 2014). The reported accuracy of PDR using inertial measurement units (IMUs) combined with recalibration from other sources such as Wi-Fi is approximately 1 m (Lymberopoulos and Liu, 2017) or 1% of the trajectory length (Mautz, 2012). A drift of 1% is acceptable for short distances but not for long-distance indoor localisation applications; for example, after a 500 m trajectory it corresponds to a 5 m position error. Moreover, visual odometry and SLAM methods are susceptible to failure in poorly textured indoor environments, such as corridors, due to the lack of image features. Additionally, SLAM relies on loop closure, which is not practical for navigation applications, e.g. in a long tunnel, as the user cannot be forced to make loops.

Model-based visual tracking methods overcome the above challenges by using a 3D model to correct the drift (Lepetit and Fua, 2005). Furthermore, methods based on model-based visual tracking (such as the work of Drummond and Cipolla, 2002) eliminate the requirement for textured indoor environments and are computationally inexpensive (Lepetit and Fua, 2005). However, these model-based tracking approaches have been designed for tracking small objects in small spaces, such as a single room, and are therefore unsuitable for continuous localisation in large indoor environments. Moreover, existing works that use model-based visual tracking lack a comprehensive evaluation of the achievable accuracy, robustness and quality of the estimated trajectory.

In this paper we present BIM-Tracker: a model-based tracking approach to indoor localisation that matches images captured by a mobile device with the corresponding view of a building information model (BIM). The edges in the images are matched with the edges derived from the BIM to estimate the location of the camera in the BIM coordinate system through model-based visual tracking. The advantage of performing localisation in a BIM coordinate system is that the estimated locations are not prone to drift, in contrast to incremental tracking methods that perform localisation by local motion estimation (Khoshelham and Ramezani, 2017). Because most smartphones and smart glasses are equipped with a camera, and a low level-of-detail 3D model of the environment is usually available or can be easily generated, the present research proposes a model-based visual tracking approach for accurate and drift-free localisation in indoor environments without any local infrastructure. The following are the main contributions of the article:

  • 1.

    We formulate an MSAC (M-estimator sample consensus) framework that uses two hypotheses, one on either side of a back-projected model line, to search for the corresponding image edges (see the sketch after this list). This strategy balances the robustness gained from multiple hypotheses against the higher computational cost of tracking many of them.

  • 2.

    We provide experimental insight into the optimal camera configurations and the factors that contribute to errors for a model-based visual tracking approach in an indoor environment. A detailed analysis of the estimated trajectory is performed using a photo-realistic synthetic dataset under several configurations, including different image resolutions, camera fields of view (FOV), motion blur, clutter and occlusions.

  • 3.

    We demonstrate the ability of BIM-Tracker to perform drift-free localisation using real images, which makes it suitable for navigation and augmented reality applications.
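
To make contribution 1 concrete, the following is a minimal MATLAB sketch of a two-hypothesis edge search around a back-projected model line (MATLAB is chosen only to match the language of the released implementation; this code is not taken from it). For each control point sampled on the line, the strongest edge response is searched on either side along the line normal, and the line placement is scored with an MSAC-style truncated cost. The synthetic frame, the line endpoints and all thresholds are illustrative assumptions.

% Minimal sketch (not the authors' code): two edge hypotheses per control
% point, one on each side of a back-projected model line, scored with an
% MSAC-style truncated cost.  Frame, line and thresholds are synthetic.
I = zeros(240, 480);                     % synthetic frame: step edge at row 90
I(90:end, :) = 1;
[Gx, Gy] = gradient(I);                  % simple gradient as the edge response
Gmag = hypot(Gx, Gy);

p1 = [120; 80];  p2 = [400; 95];         % back-projected model line (pixels)
n  = [-(p2(2) - p1(2)); p2(1) - p1(1)];
n  = n / norm(n);                        % unit normal of the projected line

numCtrl    = 10;    % control points sampled along the model line
searchLen  = 15;    % 1-D search range along the normal (pixels)
gradThresh = 0.2;   % minimum edge response to accept a hypothesis
T          = 4;     % MSAC truncation threshold (pixels)

cost = 0;
for k = 1:numCtrl
    c = p1 + (k - 0.5)/numCtrl * (p2 - p1);      % control point on the line
    r = inf;                                     % residual for this point
    for s = [1, -1]                              % positive and negative side
        best = 0;  hyp = [];
        for d = 1:searchLen
            q = round(c + s*d*n);                % pixel along the normal
            if Gmag(q(2), q(1)) > max(best, gradThresh)
                best = Gmag(q(2), q(1));  hyp = q;
            end
        end
        if ~isempty(hyp)                         % keep the nearer valid hypothesis
            r = min(r, abs(dot(hyp - c, n)));    % perpendicular distance
        end
    end
    cost = cost + min(r^2, T^2);                 % truncated (MSAC) contribution
end
fprintf('MSAC cost of this line placement: %.2f\n', cost);

In the full tracker, a cost of this kind would be evaluated for candidate camera poses and the pose with the lowest truncated cost retained.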

The paper proceeds with a review of visual localisation methods and related work on model-based tracking in Section 2. The theory and methodology for model-based visual tracking using edges are explained in Section 3. The experimental design and the evaluation results are discussed in Section 4, followed by a discussion of limitations in Section 5 and conclusions in Section 6.

Section snippets

Background and related work

Visual methods can be classified as visual odometry, SLAM, model-based tracking, and the integration of IMUs with these methods. Visual odometry (Nister et al., 2004, Scaramuzza and Fraundorfer, 2011) is a local motion estimation approach, where the motion of the camera is used to perform incremental tracking. Consequently, errors accumulate and the estimated locations drift from the true locations. Visual landmarks (Zhu et al., 2007) have been used to reduce drift by recalibrating the
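
As a simple illustration of this accumulation (an illustrative sketch with arbitrary noise levels, not a model of any particular system), the following MATLAB snippet composes noisy relative motion estimates and shows how the position error grows with distance travelled; a model-based tracker that estimates each pose directly in the building coordinate system does not exhibit this growth.

% Illustrative sketch: composing noisy relative motion estimates makes the
% position error grow with trajectory length.  Noise levels are arbitrary.
rng(1);
numSteps = 500;                  % e.g. 500 steps of 1 m each
stepTrue = [1; 0];               % true per-step motion in the world frame
sigmaT   = 0.01;                 % 1 cm translation noise per step
sigmaR   = 0.2*pi/180;           % 0.2 degree heading noise per step

poseTrue = [0; 0];  poseOdom = [0; 0];  heading = 0;
drift = zeros(1, numSteps);
for k = 1:numSteps
    poseTrue = poseTrue + stepTrue;
    heading  = heading + sigmaR*randn;                 % heading error accumulates
    R = [cos(heading) -sin(heading); sin(heading) cos(heading)];
    poseOdom = poseOdom + R*(stepTrue + sigmaT*randn(2,1));
    drift(k) = norm(poseOdom - poseTrue);              % error w.r.t. ground truth
end
fprintf('Position error after %d m travelled: %.2f m\n', numSteps, drift(end));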

Methodology

The design of an infrastructure-free localisation approach that can be universally adopted requires knowledge of the parameters that enable robust performance, considering the limitations of running such an approach on smartphones and wearable devices, and the challenges presented by a dynamic indoor environment. The main hypothesis of the research is that centimetre-level localisation accuracy can be achieved without any drift by integrating image information with a 3D building model. To
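
Since the methodology text above is truncated, the MATLAB sketch below only illustrates the general idea of model-based pose estimation rather than the paper's exact formulation: 3D control points sampled on the building model are projected with the current pose guess and aligned to their matched 2-D image edge points by a Gauss-Newton refinement over the 6-DoF pose. The intrinsics, the points, the use of point-to-point residuals (instead of point-to-line distances) and the numerical Jacobian are simplifying assumptions.

% Minimal sketch (simplifying assumptions, not the paper's formulation):
% refine a 6-DoF camera pose so that projected 3-D control points from the
% building model align with their matched 2-D image edge points.
K = [800 0 320; 0 800 240; 0 0 1];                           % assumed intrinsics

% 3-D control points (BIM frame, metres) and a synthetic "true" pose
X     = [0 0 3; 1 0 3.2; 2 0 2.8; 0 1 3.1; 1 1 2.9; 2 1 3]'; % 3 x N
xTrue = [0.02; -0.01; 0.01; 0.05; -0.03; 0.02];              % [rotvec; t]
uvObs = project(X, xTrue, K);                 % stands in for matched edge points

x = zeros(6, 1);                              % initial pose guess (previous frame)
for it = 1:10                                 % Gauss-Newton iterations
    r = reshape(project(X, x, K) - uvObs, [], 1);     % reprojection residuals
    J = zeros(numel(r), 6);
    for j = 1:6                                        % numerical Jacobian
        dx = zeros(6, 1);  dx(j) = 1e-6;
        J(:, j) = (reshape(project(X, x + dx, K) - uvObs, [], 1) - r) / 1e-6;
    end
    x = x - (J'*J) \ (J'*r);                           % pose update
end
fprintf('Pose parameter error after refinement: %.2e\n', norm(x - xTrue));

function uv = project(X, x, K)
% Pinhole projection of 3-D points X under pose x = [rotation vector; translation].
theta = norm(x(1:3));
if theta < eps
    R = eye(3);
else
    a = x(1:3)/theta;
    A = [0 -a(3) a(2); a(3) 0 -a(1); -a(2) a(1) 0];
    R = eye(3) + sin(theta)*A + (1 - cos(theta))*(A*A);       % Rodrigues formula
end
Xc = R*X + x(4:6);                                            % camera-frame points
uv = K(1:2, 1:2)*(Xc(1:2, :)./Xc(3, :)) + K(1:2, 3);          % perspective projection
end

In an edge-based tracker the residuals would instead be the distances between matched image edges and the projected model lines, weighted robustly, but the structure of the update is the same.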

Experiments and results

During the evaluation of localisation accuracy, the correctness of the ground truth plays a vital role. Ground truth for the evaluation of trajectories in indoor spaces is usually collected by camera-based (Fod et al., 2002), motion capture (Huang et al., 2017), laser tracking (Teulière et al., 2015), surveying (Bürki et al., 2010) or target tracking (Boochs et al., 2010) methods. All of these methods require specific hardware or platforms for the collection of the ground truth and usually have
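
For completeness, the kind of per-epoch accuracy evaluation referred to here can be sketched in a few lines of MATLAB; the two trajectories below are synthetic placeholders standing in for the estimated trajectory and a time-synchronised ground truth in the BIM frame.

% Illustrative sketch: per-epoch position error of an estimated trajectory
% against a time-synchronised ground truth (both synthetic placeholders).
rng(2);
t   = (0:0.1:30)';                              % timestamps (s)
gt  = [2*t, sin(0.3*t), zeros(size(t))];        % ground-truth positions (m)
est = gt + 0.03*randn(size(gt));                % estimate with ~3 cm noise

err = sqrt(sum((est - gt).^2, 2));              % Euclidean error per epoch (m)
fprintf('Mean %.3f m, RMSE %.3f m, max %.3f m\n', ...
        mean(err), sqrt(mean(err.^2)), max(err));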

Discussions

Although BIM-Tracker is quite robust against motion blur, occlusions and illumination variations, it fails under notoriously challenging conditions. Firstly, in the presence of heavy motion blur, the edges extracted by the edge detector may be missed, noisy or displaced by a few pixels. As a result, wrong 3D-2D correspondences are generated, leading to a poor pose estimate. Fig. 17(a) shows one such pose, where image blur caused tracking to fail. Although the
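
One pragmatic way to handle such failures (an assumption on our part; the snippet above does not state how the authors detect them) is a simple tracking-health check: when heavy blur suppresses the edge response, few control points find a valid hypothesis and the MSAC inlier ratio collapses, so the pose update can be rejected and the previous pose kept or re-initialisation triggered.

% Hypothetical tracking-health check (not necessarily the paper's mechanism):
% reject the pose update when too few control points find an edge hypothesis
% or when the MSAC inlier ratio collapses, as happens under heavy motion blur.
numCtrl    = 120;      % control points on visible model lines
numMatched = 31;       % points with an edge response above the threshold
numInliers = 18;       % correspondences within the MSAC threshold

matchRatio  = numMatched/numCtrl;
inlierRatio = numInliers/max(numMatched, 1);
if matchRatio < 0.4 || inlierRatio < 0.6
    fprintf('Tracking unreliable (matched %.0f%%, inliers %.0f%%): reject pose update\n', ...
            100*matchRatio, 100*inlierRatio);
end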

Conclusion

An approach was developed for infrastructure-free indoor localisation based on model-based visual tracking. Evaluation of the approach suggests that a localisation accuracy of 10 cm can be achieved using a low level-of-detail 3D model derived from a BIM. Moreover, there is no accumulation of error, which makes this approach suitable for indoor localisation tasks over long periods. Experiments with photo-realistic synthetic data suggest that a higher resolution of the image and a

Acknowledgements

This research was supported by a Research Engagement Grant from the Melbourne School of Engineering and a Melbourne Research Scholarship. The authors would like to sincerely thank the reviewers for their invaluable and constructive suggestions that helped us to improve the quality of the research.

References (59)

  • B. Bürki et al. Daedalus: a versatile usable digital clip-on measuring system for total stations.
  • C. Cadena et al. Past, present, and future of simultaneous localization and mapping: toward the robust-perception age. IEEE Trans. Robot. (2016).
  • J. Canny. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. (1986).
  • C. Choi et al. Real-time 3D model-based tracking using edge and keypoint features for robotic manipulation.
  • A.I. Comport et al. A real-time tracker for markerless augmented reality.
  • P. David et al. SoftPOSIT: simultaneous pose and correspondence determination.
  • P. David et al. Simultaneous pose and correspondence determination using line features.
  • A.J. Davison. Real-time simultaneous localisation and mapping with a single camera.
  • T. Drummond et al. Real-time visual tracking of complex structures. IEEE Trans. Pattern Anal. Mach. Intell. (2002).
  • M.A. Fischler et al. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM (1981).
  • A. Fod et al. A laser-based people tracker.
  • X.-S. Gao et al. Complete solution classification for the perspective-three-point problem. IEEE Trans. Pattern Anal. Mach. Intell. (2003).
  • A.P. Gee et al. Real-time model-based SLAM using line segments.
  • R. Gomez-Ojeda et al. Robust stereo visual odometry through a probabilistic combination of points and line segments.
  • R. Gomez-Ojeda et al. Geometric-based line segment tracking for HDR stereo sequences.
  • R. Gomez-Ojeda et al. PL-SLAM: a stereo SLAM system through the... (2017).
  • P.D. Groves. Principles of GNSS, inertial, and multisensor integrated navigation systems. Artech... (2013).
  • A. Handa et al. A benchmark for RGB-D visual odometry, 3D reconstruction and SLAM.
  • M. Hofer et al. Line-based 3D reconstruction of wiry objects.