Multi-camera networks offer the potential for a variety of novel human-centric applications by providing rich visual information. Processing acquired video locally at the source camera facilitates scalable vision networks by avoiding the transfer of raw images. Additional motivation for distributed processing stems from the need to preserve the privacy of network users while offering services in applications such as assisted living. Yet another benefit of processing images at the source is the flexibility it offers in the type of features and the level of data exchanged between cameras in a collaborative processing framework. In such a framework, data fusion can occur across three dimensions: 3D space (multiple views), time, and feature levels.
In this chapter, collaborative processing and data fusion mechanisms are examined in the context of a human pose estimation framework. For efficient collaboration between the cameras under a low-bandwidth communication constraint, only concise descriptions of extracted features, rather than raw images, are communicated. A 3D human body model is employed as the convergence point of spatiotemporal and feature fusion. The model also serves as a bridge between the vision network and the high-level reasoning module, which can extract gestures and interpret them against the user's context and behavior models to arrive at system-level decisions. The human body model also allows the cameras to interact with one another to initialize feature extraction parameters or to evaluate the relative value of their derived features.
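To make the low-bandwidth argument concrete, the following is a minimal sketch of how a camera node might package a concise feature description for exchange instead of a raw frame. All names and the descriptor layout (per-part 2D ellipse fits with confidence scores) are illustrative assumptions, not the chapter's actual protocol.

```python
import json
import zlib

def make_descriptor(camera_id, timestamp, parts):
    """Build a compact per-frame descriptor.

    parts: list of (label, cx, cy, major, minor, angle, confidence)
    describing an ellipse fit for each detected body part.
    """
    return {
        "cam": camera_id,
        "t": timestamp,
        "parts": [
            {"label": p[0], "ellipse": list(p[1:6]), "conf": p[6]}
            for p in parts
        ],
    }

def encode(descriptor):
    # Serialize and compress: typically a few hundred bytes,
    # versus tens of kilobytes for even a small raw image.
    return zlib.compress(json.dumps(descriptor).encode("utf-8"))

def decode(payload):
    return json.loads(zlib.decompress(payload).decode("utf-8"))

desc = make_descriptor("cam3", 17.25,
                       [("torso", 160.0, 120.0, 40.0, 22.0, 0.1, 0.93),
                        ("head", 158.0, 70.0, 12.0, 11.0, 0.0, 0.88)])
payload = encode(desc)
assert decode(payload) == desc
```

A receiving node (or the fusion module maintaining the 3D body model) would decode such payloads from several cameras and associate the parts across views and time; the descriptor, not the image, is what crosses the network.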
© 2008 Springer Science+Business Media, LLC
Aghajan, H., Wu, C., Kleihorst, R. (2008). Distributed Vision Networks for Human Pose Analysis. In: Mandic, D., Golz, M., Kuh, A., Obradovic, D., Tanaka, T. (eds) Signal Processing Techniques for Knowledge Extraction and Information Fusion. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-74367-7_10
Print ISBN: 978-0-387-74366-0
Online ISBN: 978-0-387-74367-7