Journal of Visual Communication and Image Representation
Pose Depth Volume extraction from RGB-D streams for frontal gait recognition
Highlights
- We combine depth and RGB information from Kinect for frontal gait recognition.
- Key poses are extracted using depth frames registered in the RGB frame coordinate system.
- A new feature named Pose Depth Volume is proposed.
- A comparative study with existing gait features has been carried out.
Introduction
Like several other biometric-based identification methods, gait has been studied extensively as a biometric feature in recent years. An advantage of gait recognition is that, unlike existing biometric methods such as fingerprint recognition, iris scanning and face recognition, the gait of a subject can be recognized from a distance without the subject's active participation. This is because detailed texture information is not required for gait recognition. The main aim of gait recognition is to capture the position variation of human limbs during walking, and this can be done using binary silhouettes extracted from images that need not be of very high quality. Over the years, while model-based and appearance-based gait recognition have captured significant attention from researchers, appearance-based gait recognition entirely from the frontal view has received much less focus. The use of depth cameras in gait recognition is also quite rare.
In this paper, we concentrate on gait recognition from the frontal view (frontal gait recognition) only. An advantage of frontal gait recognition is that walking videos captured from this viewpoint do not suffer from the self-occlusion due to hand swings that prevails in the fronto-parallel view. Also, since the camera is positioned directly in front of the walking person, videos can be captured even in a narrow, corridor-like setting. However, a disadvantage of frontal gait recognition is that binary silhouettes extracted from RGB video frames cannot indicate which limb (left or right) of a walking person is nearer to the camera and which one is behind. Thus, pose ambiguity cannot be adequately resolved, leading to incorrect gait recognition. This information deficiency is absent in depth images, where the depth values indicate whether the right limb is forward and the left limb backward or the other way round. The variation of depth in limb positions, together with the variation of shape, is an important element of frontal gait recognition.
Recently developed depth cameras like Kinect [1], [2] can efficiently capture the depth variation of different human body parts during walking. But the depth video frames so obtained are quite noisy, as a result of which the extracted object silhouettes are often not clean. In contrast, silhouettes extracted from the RGB video frames are much cleaner, but their shape variation over a gait cycle is not sufficient for extracting useful gait features. In order to capture both color and shape information in a single frame, we combine the RGB and depth video streams from Kinect to derive a new gait feature. Each silhouette from the Kinect depth frame is projected into the RGB frame coordinates using a standard registration procedure, forming a silhouette in the transformed space which we term a depth registered silhouette. Previously, registration of Kinect depth and RGB frames has been used for 3D reconstruction from depth videos captured from multiple views of an object [4]. However, to the best of our knowledge, no gait recognition method exists which fuses RGB and depth information to derive gait features. It may be noted that there is no publicly available frontal gait database with both color and depth video frames of walking persons recorded simultaneously. So, we have built a new database using Microsoft Kinect by capturing walking sequences of 30 individuals.
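The registration step described above maps each depth pixel into the RGB frame coordinate system. A minimal sketch of one such mapping is given below, assuming calibrated pinhole models for both sensors; the intrinsic matrices `K_depth` and `K_rgb` and the depth-to-RGB rigid transform `(R, t)` are hypothetical placeholder values, not the paper's calibration.

```python
import numpy as np

# Hypothetical calibration parameters; real values come from a Kinect calibration step.
K_depth = np.array([[580.0, 0.0, 320.0],
                    [0.0, 580.0, 240.0],
                    [0.0,   0.0,   1.0]])   # depth camera intrinsics
K_rgb   = np.array([[525.0, 0.0, 320.0],
                    [0.0, 525.0, 240.0],
                    [0.0,   0.0,   1.0]])   # RGB camera intrinsics
R = np.eye(3)                               # depth-to-RGB rotation (assumed identity)
t = np.array([0.025, 0.0, 0.0])             # depth-to-RGB translation in metres

def register_depth_pixel(u, v, z):
    """Map a depth pixel (u, v) with depth z (metres) into RGB image coordinates."""
    # Back-project the pixel to a 3D point in the depth camera frame.
    p_depth = z * np.linalg.inv(K_depth) @ np.array([u, v, 1.0])
    # Transform into the RGB camera frame and project onto the RGB image plane.
    p_rgb = K_rgb @ (R @ p_depth + t)
    return p_rgb[:2] / p_rgb[2]

print(register_depth_pixel(320, 240, 2.0))  # pixel position in the RGB image
```

Applying this mapping to every foreground pixel of a depth frame yields the depth registered silhouette in RGB coordinates.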
The proposed gait feature is termed Pose Depth Volume (PDV). It is derived from a partial volumetric reconstruction of each depth registered silhouette. First, a certain number of depth key poses are estimated from a set of training samples, and each frame of an entire walking sequence of a subject is classified into the appropriate depth key pose. A PDV is then constructed for each such pose by averaging the voxel volumes of all frames belonging to that pose. Thus, the number of PDVs per subject is the same as the number of depth key poses. Each voxel in a PDV indicates the number of times an object voxel occurs at that position for that particular depth key pose within a complete gait cycle. A classifier is trained with gait cycles of subjects in the training data set, and a different gait cycle is used for testing the recognition accuracy.
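The per-pose averaging that produces the PDVs can be sketched as follows. The grid resolution, number of key poses, and the random stand-in volumes are all hypothetical; in the actual pipeline the inputs would be the binary voxel reconstructions of the depth registered silhouettes over one gait cycle.

```python
import numpy as np

GRID = (8, 8, 8)   # voxel grid resolution (hypothetical toy size)
N_POSES = 4        # number of depth key poses (hypothetical)

def build_pdvs(volumes, pose_labels, n_poses=N_POSES):
    """Average the binary voxel volumes of all frames assigned to each depth key pose."""
    pdvs = np.zeros((n_poses,) + GRID)
    counts = np.zeros(n_poses)
    for vol, pose in zip(volumes, pose_labels):
        pdvs[pose] += vol
        counts[pose] += 1
    counts[counts == 0] = 1            # guard against poses with no assigned frames
    return pdvs / counts[:, None, None, None]

# Usage with random stand-in volumes for a 20-frame gait cycle.
rng = np.random.default_rng(0)
volumes = rng.integers(0, 2, size=(20,) + GRID)
labels = rng.integers(0, N_POSES, size=20)
pdvs = build_pdvs(volumes, labels)
print(pdvs.shape)   # one averaged volume (PDV) per depth key pose
```

Each resulting voxel value lies in [0, 1] and reflects how often that voxel is occupied for the given key pose, matching the occurrence-count interpretation described above.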
The rest of the paper is organized as follows. Section 2 introduces the Kinect RGB-D camera and basic functionality of its different parts. A brief background study on gait is also included in this section. Section 3 illustrates the sequence of steps followed in deriving our proposed gait feature. Positioning of the Kinect camera and construction of the data set together with experimental results are presented in Section 4. Finally, Section 5 concludes the paper and points out future scope of work.
Section snippets
Basics of RGB-D Kinect camera
RGB-D cameras [2], [3] provide depth and color information of an object simultaneously. Kinect, developed by Microsoft, is one such camera [1]. It captures depth information through its infrared projector and sensor. The infrared laser emitted by Kinect draws a structured pattern on the object surface, and the infrared camera infers depth from this pattern using a technology based on the structured-light principle. Apart from the infrared projector and sensor, Kinect also houses an RGB camera and a multi-array microphone.
Gait recognition using Pose Depth Volume
In this section, we describe a new feature called Pose Depth Volume (PDV). The applicability of the feature is tested on videos captured by Microsoft Kinect. Instead of using depth videos directly, as is done in the case of GEV (Gait Energy Volume), we combine RGB and depth information from Kinect to obtain cleaner silhouettes along with depth information. To capture the intrinsic dynamics of gait better than GEV, we divide an entire gait cycle into a number of depth key poses. Averaging of voxel volumes is done over all the frames that belong to the same depth key pose.
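Assigning a frame to a depth key pose can be sketched as a nearest-key-pose rule over vectorized silhouettes. The key-pose vectors below are hypothetical low-dimensional stand-ins (e.g. cluster centroids learned from training frames), and Euclidean distance is an assumed choice; the paper's own estimation procedure for key poses is described in the full text.

```python
import numpy as np

def assign_key_pose(frame_vec, key_poses):
    """Assign a flattened silhouette vector to its nearest depth key pose (Euclidean)."""
    dists = np.linalg.norm(key_poses - frame_vec, axis=1)
    return int(np.argmin(dists))

# Hypothetical key poses as mean silhouette vectors in a toy 2-D feature space.
key_poses = np.array([[0.0, 0.0],
                      [1.0, 1.0],
                      [2.0, 0.0]])

print(assign_key_pose(np.array([0.9, 1.1]), key_poses))  # → 1 (closest to [1, 1])
```

Every frame of a walking sequence is labelled this way before the per-pose voxel averaging is performed.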
Experimental results
In this section, we present results from an extensive set of experiments carried out using the proposed Pose Depth Volume (PDV) feature. Experiments have been conducted in the context of biometric-based identification, in which the features of a test subject are matched against a gallery of previously captured and annotated feature sets. The proposed gait recognition algorithm has been implemented in Matlab 7.12.0 (R2010a) on a 2.50 GHz Intel Core i5 processor with 4 GB RAM.
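The gallery-matching protocol described above can be sketched with a simple nearest-neighbour rule; the paper does not fix the classifier here, so 1-NN with Euclidean distance is an assumed stand-in, and the subject labels and feature vectors below are hypothetical.

```python
import numpy as np

def match_subject(test_feat, gallery):
    """Return the gallery label whose feature vector is nearest to the test feature.
    (Assumed 1-NN matching; the paper's actual classifier may differ.)"""
    best_label, best_dist = None, np.inf
    for label, feat in gallery.items():
        dist = np.linalg.norm(test_feat - feat)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical gallery of annotated PDV-derived feature vectors.
gallery = {"subject_a": np.array([1.0, 0.0]),
           "subject_b": np.array([0.0, 1.0])}

print(match_subject(np.array([0.9, 0.1]), gallery))  # → subject_a
```

In the actual experiments the feature vectors would be derived from the per-pose PDVs of a probe gait cycle and compared against the stored gallery cycles.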
Conclusions
In this paper, we have combined the depth and color streams from Kinect by registering the depth frames with their corresponding color frames. Next, we have introduced a novel feature called Pose Depth Volume, constructed by averaging the voxel volumes of all frames belonging to the same pose. The proposed feature captures both the shape and depth variations of an individual's walking sequence over each depth key pose of a gait cycle. Experiments carried out on a data set comprising 30 individuals demonstrate the effectiveness of the proposed feature.
Acknowledgments
This work is partially funded by project Grant No. 22(0554)/11/EMR-II sponsored by the Council of Scientific and Industrial Research, Govt. of India. The authors thank the anonymous reviewers for their constructive suggestions.
References (31)
- et al., Gait recognition based on dynamic region analysis, Signal Processing (2008)
- et al., Frame difference energy image for gait recognition with incomplete silhouettes, Pattern Recognition Letters (2009)
- et al., Active energy image plus 2DLPP for gait recognition, Signal Processing (2010)
- et al., Gait recognition using pose kinematics and pose energy image, Signal Processing (2012)
- Microsoft Kinect sensor and its effect, IEEE Multimedia (2012)
- Kinect for Windows....
- Sony Depth Sensing Camera: Sony Patents Kinect-Like 3D Depth-Sensing Camera for PlayStation Consoles....
- R.A. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A.J. Davison, P. Kohli, J. Shotton, S. Hodges, A.W....
- A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE (1989)
- et al., Gait recognition: a challenging signal processing technology for biometrics identification, IEEE Signal Processing Magazine (2005)
- The recognition of human movement using temporal templates, IEEE Transactions on Pattern Analysis and Machine Intelligence
- Automated person recognition by walking and running via model-based approaches, Pattern Recognition
- Identification of humans using gait, IEEE Transactions on Image Processing
- Individual recognition using gait energy image, IEEE Transactions on Pattern Analysis and Machine Intelligence