RGB-D image-based detection of stairs, pedestrian crosswalks and traffic signs

https://doi.org/10.1016/j.jvcir.2013.11.005

Highlights

  • A new framework that can be integrated into navigation and wayfinding aids for blind people.

  • Effective methods to detect and recognize stairs and pedestrian crosswalks based on RGB-D images.

  • A new method for recognizing traffic lights of pedestrian crosswalks.

Abstract

A computer vision-based wayfinding and navigation aid can improve the ability of blind and visually impaired people to travel independently. In this paper, we develop a new framework to detect and recognize stairs, pedestrian crosswalks, and traffic signals based on RGB-D (Red, Green, Blue, and Depth) images. Since both stairs and pedestrian crosswalks are characterized by a group of parallel lines, we first apply the Hough transform to extract the concurrent parallel lines based on the RGB (Red, Green, and Blue) channels. Then, the Depth channel is employed to distinguish pedestrian crosswalks from stairs. The detected stairs are further identified as stairs going up (upstairs) and stairs going down (downstairs). The distance between the camera and the stairs is also estimated for blind users. Furthermore, the traffic signs of pedestrian crosswalks are recognized. The detection and recognition results on our collected datasets demonstrate the effectiveness and efficiency of the proposed framework.

Introduction

Independent travel and active interactions with the dynamic surrounding environment are well known to present significant challenges for individuals with severe vision impairment, thereby reducing quality of life and compromising safety. In order to improve the ability of people who are blind or have significant visual impairments to access, understand, and explore surrounding environments, many assistant technologies and devices have been developed to accomplish specific navigation goals, obstacle detection, or wayfinding tasks.

Many electronic mobility assistant systems have been developed based on converting sonar information into an audible signal for visually impaired persons to interpret [3], [10], [12], [13], [18]. However, such systems provide only limited information. More recently, researchers have focused on translating visual information into a high-level representation before conveying it to visually impaired persons. Coughlan and Shen [6] developed a method of finding crosswalks based on figure-ground segmentation, built in a graphical model framework that groups geometric features into a coherent structure. Ivanchenko et al. [9] further extended the algorithm to detect the location and orientation of pedestrian crosswalks for a blind or visually impaired person using a cell phone camera. The prototype system runs in real time on an off-the-shelf Nokia N95 camera phone: the phone automatically takes several images per second, analyzes each image in a fraction of a second, and sounds an audio tone when it detects a pedestrian crosswalk. Advanyi et al. [1] employed Bionic eyeglasses to provide blind or visually impaired individuals with navigation and orientation information based on enhanced color preprocessing through mean shift segmentation; detection of pedestrian crosswalks was then carried out via a partially adaptive Cellular Nanoscale Networks algorithm. Se [24] proposed a method to detect zebra crosswalks by first finding the crossing lines as groups of concurrent lines and then partitioning edges using intensity variation information. Se and Brady [25] also developed a Gabor filter based texture detection method to detect distant staircases. When the stairs are close enough, staircases are detected by looking for groups of concurrent lines, where convex and concave edges are partitioned using intensity variation information. The pose of the stairs was also estimated by a homography search model.
The “vOICe” system [29] is a commercially available vision-based travel aid that converts image information to sound. The system consists of a head-mounted camera, stereo headphones, and a laptop. Uddin and Shioyama [30] proposed a bipolarity-based segmentation and projective invariant-based method to detect zebra crosswalks. They first segmented the image on the basis of bipolarity and selected candidates on the basis of area, then extracted feature points in the candidate areas based on the Fisher criterion; zebra crosswalks were recognized using projective invariants. Omachi and Omachi [19] proposed an image-based traffic sign detection method using shape and color information, and later improved it by adding traffic sign structure derived from the Hough transform as a critical factor [20]. Charette and Nashashibi implemented a real-time image processing system for traffic signs based on generic “adaptive templates” [5]. Alefs and Eschemann designed a computer vision-based system for reliable road sign detection [2]; the detection is based on feature selection and matching using edge orientation histograms [4]. Everingham et al. [8] developed a wearable mobility aid for people with low vision using scene classification in a Markov random field model framework: an outdoor scene is segmented based on color information, and the regions are then classified as sky, road, buildings, etc. Lausser et al. [14] introduced a visual zebra crossing detector based on the Viola–Jones approach. Pallejà et al. developed an electronic white cane based on a LIDAR, a tri-axial accelerometer, and a tactile belt to help blind people [21]. Shoval et al. [23] discussed the use of mobile robotics technology in the Guide-Cane device, a wheeled device pushed ahead of the user via an attached cane that helps the blind avoid obstacles: when the Guide-Cane detects an obstacle, it steers around it.
The user immediately feels this steering action and can follow the Guide-Cane’s new path. Tian’s group has developed a series of computer vision-based methods for blind people to independently access and navigate unfamiliar environments [26], [27], [28], [31], [32], [33], [35], [36].

In this paper, we propose a computer vision-based framework to detect staircases, pedestrian crosswalks, and traffic signs. The computer vision-based wayfinding and navigation aid for blind persons integrates an RGB-D camera [21], a microphone, a portable computer, and a speaker connected by Bluetooth for audio description of the objects identified. In our prototype system, an HP mini laptop conducts the image processing and data analysis. The RGB-D camera, mounted on the user’s belt, captures videos of the environment and is connected to the laptop via USB. The presence of environmental objects (stairs, crosswalks, traffic signs, etc.) is described to the blind user by a verbal display with minimal distraction to the user’s sense of hearing, and the user can control the system by speech input via the microphone. The recent introduction of cost-effective RGB-D cameras eases the task [17], [22]. We employ RGB-D cameras for the following reasons: (a) they contain both an RGB camera and a 3D depth camera, which together provide more information about the scene; (b) they work well in low-light environments; (c) they are low-cost (about US$150); and (d) they are efficient enough for real-time processing. The RGB-D camera captures RGB images at a resolution of 640 × 480 and depth points at 30 frames per second, with a field of view of about 60°. Compared to existing staircase detection work [15], [16], [34], which depends only on RGB videos or stereo cameras, our proposed method is more robust and efficient. More importantly, our framework integrates multiple functions, including detection of staircases, crosswalks, and traffic signs.

As shown in Fig. 1, our framework consists of two main components: (a) detection and recognition of stairs and pedestrian crosswalks; and (b) detection and recognition of crosswalk “red” traffic signs. For stair and crosswalk detection and recognition, a group of parallel lines is first detected via the Hough transform and line fitting with geometric constraints from the RGB information. To distinguish stairs from pedestrian crosswalks, we extract a one-dimensional depth feature from the depth image along the direction of the longest detected line. This depth feature is then employed as the input of a support vector machine (SVM) based classifier [4] to recognize stairs and pedestrian crosswalks. For stairs, we further distinguish upstairs from downstairs, and we estimate the distance between the camera and the stairs for the blind user. For crosswalk “red” traffic sign detection and recognition, a scale-invariant template matching method with a color filter is first applied to the RGB images to detect candidate “red” traffic signs. Histograms of Oriented Gradients (HOG) features [7] are then extracted as the input of an SVM-based classifier to further distinguish “red” traffic signs from “non-red” traffic signs.
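The parallel-line detection step above can be illustrated with a minimal Hough transform sketch in NumPy. This is an illustrative toy, not the paper's implementation: the synthetic edge image, bin sizes, and peak-picking are our own assumptions. The key idea it demonstrates is that a group of parallel lines votes into accumulator peaks that share a single theta.

```python
import numpy as np

def hough_accumulate(edges, n_theta=180):
    """Vote each edge pixel into (rho, theta) space; rho is offset by +diag."""
    h, w = edges.shape
    thetas = np.deg2rad(np.arange(n_theta))          # 0..179 degrees
    diag = int(np.ceil(np.hypot(h, w)))
    acc = np.zeros((2 * diag, n_theta), dtype=int)
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        # For each theta, this pixel lies on exactly one line rho = x cos + y sin.
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + diag, np.arange(n_theta)] += 1
    return acc, diag

# Synthetic "staircase" edge map: four horizontal parallel lines.
img = np.zeros((100, 100), dtype=np.uint8)
for row in (20, 40, 60, 80):
    img[row, 10:90] = 1

acc, diag = hough_accumulate(img)
# The four strongest accumulator cells correspond to the four lines.
flat = np.argsort(acc, axis=None)[-4:]
rhos, thetas = np.unravel_index(flat, acc.shape)
peak_rhos = sorted(int(r) - diag for r in rhos)
peak_thetas = sorted({int(t) for t in thetas})
print(peak_rhos, peak_thetas)   # four distinct rhos, one shared theta (parallel)
```

Because all four peaks share the same theta (90°, i.e. horizontal), the grouping stage can accept them as a candidate stair or crosswalk pattern; isolated lines at scattered angles would not form such a cluster.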

The remainder of the paper is organized as follows. Section 2 describes the methodology of our proposed algorithm, which includes the following main components: (1) detection of staircases or pedestrian crosswalks based on RGB image analysis; (2) classification of stairs versus pedestrian crosswalks; and (3) recognition of upstairs and downstairs. Section 3 introduces the proposed method for pedestrian traffic sign detection and recognition. Section 4 evaluates the effectiveness and efficiency of the proposed method. Section 5 summarizes the paper.

Section snippets

Detecting candidates of pedestrian crosswalks and stairs from RGB images

There are various kinds of staircases and pedestrian crosswalks. In this paper, we focus on staircases with uniform tread and steps, and on the most common type of pedestrian crosswalk, the zebra crosswalk with alternating white bands. In our application of blind navigation and wayfinding, we focus on detecting stairs or pedestrian crosswalks at close distance.

Stairs consist of a sequence of steps which can be regarded as a group of consecutive curb edges, and pedestrian crosswalks can be
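The role of the depth channel in separating the two candidate classes can be sketched with a toy heuristic. The paper feeds the one-dimensional depth profile into an SVM classifier; the jump-counting rule, thresholds, and synthetic profiles below are our simplified stand-in, chosen only to show why the depth signal is discriminative: stair steps produce repeated depth discontinuities, while a crosswalk lies on a single ground plane and varies smoothly.

```python
import numpy as np

def classify_profile(depth, jump_thresh=0.15):
    """Toy heuristic (not the paper's SVM): count large first-difference
    jumps along the scan line; stairs show several, a planar crosswalk none."""
    jumps = np.abs(np.diff(depth)) > jump_thresh
    return "stairs" if jumps.sum() >= 2 else "crosswalk"

# Synthetic depth profiles in metres, sampled along the scan direction.
stairs = np.repeat(np.arange(1.0, 2.0, 0.25), 10)   # piecewise-constant steps
crosswalk = np.linspace(1.0, 3.0, 40)               # smooth ground plane
label_a = classify_profile(stairs)
label_b = classify_profile(crosswalk)
print(label_a, label_b)
```

In the real system a learned classifier is preferable to fixed thresholds, since depth noise, viewing angle, and step geometry vary; the sketch only captures the shape of the feature being learned.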

Method overview

Blind and visually impaired pedestrians face critical safety challenges while crossing streets at intersections, especially in unfamiliar environments. In this paper, we focus on crosswalk traffic sign detection and recognition. As shown in Fig. 1(b), our method of crosswalk traffic sign detection and recognition contains four main steps: (1) Preprocessing of the original input images, which includes down-sampling the original images and Gaussian smoothing to remove noise. (2) Detection of the
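The color-filter stage that gates candidate "red" traffic signs can be sketched as a simple per-channel dominance test. The thresholds and the dominance ratio below are illustrative assumptions, not the paper's tuned values, and the synthetic patch stands in for a real camera frame:

```python
import numpy as np

def red_mask(rgb, r_min=150, dominance=1.5):
    """Keep pixels whose red channel is strong and clearly dominates
    green and blue (illustrative thresholds, not the paper's)."""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    return (r > r_min) & (r > dominance * g) & (r > dominance * b)

# Synthetic 4x4 patch: one saturated red pixel on a grey background.
img = np.full((4, 4, 3), 120, dtype=np.uint8)
img[1, 2] = (220, 40, 30)
mask = red_mask(img)
print(int(mask.sum()), np.argwhere(mask).tolist())
```

Pixels surviving the mask would then be grouped into candidate regions and passed to the scale-invariant template matching and HOG/SVM stages for final verification.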

Stair and crosswalk database

To evaluate the effectiveness and efficiency of the proposed method, we have collected a database for stair and crosswalk detection and recognition. The database is divided into two sub-datasets: a testing dataset and a training dataset. The training dataset contains 30 images for each category (i.e., upstairs, downstairs, crosswalks, and negative images which contain neither stairs nor pedestrian crosswalks), which are randomly selected to train the SVM classifiers. Then the remaining images are

Conclusions and future work

We have developed a framework for the automatic detection of stairs, pedestrian crosswalks, and traffic signs from images to improve the travel safety of blind and visually impaired people. Our proposed framework has been evaluated on the databases of stairs, pedestrian crosswalks, and traffic signs, and has achieved average accuracy rates of 91.1% for detecting stairs and pedestrian crosswalks from scene images, 95.8% for classification of stairs and pedestrian crosswalks, 90.3% for

Acknowledgments

This work was supported in part by NSF Grants EFRI-1137172, IIP-1343402, FHWA DTFH61-12-H-00002, ARO Grant W911NF-09-1-0565, and Microsoft Research. The authors thank the anonymous reviewers for their constructive comments and insightful suggestions that improved the quality of this paper.

References (36)

  • R. Advanyi, B. Varga, K. Karacs, Advanced crosswalk detection for the bionic eyeglass, in: 12th International Workshop...
  • B. Alefs, G. Eschemann, H. Ramoser, C. Beleznai, Road sign detection from edge orientation histogram, in: IEEE...
  • M. Bousbia-Salah, A. Redjati, M. Fezari, M. Bettayeb, An ultrasonic navigation system for blind people, in: IEEE...
  • C. Chang, C. Lin, LIBSVM: a library for support vector machine, 2001,...
  • R. Charette, F. Nashashibi, Traffic light recognition using image processing compared to learning processes, in: IEEE...
  • J. Coughlan, H. Shen, A fast algorithm for finding crosswalks using figure-ground segmentation. in: The 2nd Workshop on...
  • N. Dalal et al., Histograms of oriented gradients for human detection, Comput. Vision Pattern Recognit. (2005)
  • M. Everingham et al., Wearable mobility aid for low vision using scene classification in a Markov random field model framework, Int. J. Human Comput. Interact. (2003)
  • V. Ivanchenko, J. Coughlan, H. Shen, Detecting and locating crosswalks using a camera phone, computers helping people...
  • G. Kao, FM sonar modeling for navigation, Technical Report, Department of Engineering Science, University of Oxford,...
  • H. Kim et al., Grayscale template-matching invariant to rotation, scale, translation, brightness and contrast, Adv. Image Video Technol. (2007)
  • R. Kuc, A sonar aid to enhance spatial perception of the blind: engineering design and evaluation, IEEE Trans. Biomed. Eng. (2002)
  • B. Laurent et al., A sonar system modeled after spatial hearing and echolocating bats for blind mobility aid, Int. J. Phys. Sci. (2007)
  • L. Lausser, F. Schwenker, G. Palm, Detecting zebra crossings utilizing AdaBoost, in: European Symposium on Artificial...
  • Y. Lee, T. Leung, G. Medioni, Real-time staircase detection from a wearable stereo system, in: ICPR,...
  • X. Lu, R. Manduchi, Detection and localization of curbs and stairways using stereo vision, in: ICRA,...
  • Microsoft, 2010,...
  • C. Morland, D. Mountain, Design of a sonar system for visually impaired humans, in: The 14th International Conference...