State-of-the-art on spatio-temporal information-based video retrieval
Spatio-temporal information for video retrieval
Content-based video retrieval is an important area of research. Over the last decade, several practical systems have been developed with the aim of improving retrieval performance and have been tested on large-scale databases such as TRECVID (http://www-nlpir.nist.gov/projects/tvpubs/tv.pubs.org.html). Video classification and retrieval problems can be categorised hierarchically with a taxonomy, an example of which is presented by Roach et al. [1]. A key characteristic of video data is its associated
Spatial representation
Spatial information can be formulated with the following two methodologies:
- The first approach is to use weak spatial constraints and capture spatial local information to represent low-level texture features. Examples include Gabor wavelets [22], local histograms [23], co-occurrence matrices [24], colour correlograms [25], composite region templates (CRTs) [26], etc.
- The second approach is to represent global qualitative spatial relations that support high-level semantic textual queries.
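As a hedged illustration of the first approach, the sketch below computes a grey-level co-occurrence matrix, one of the local texture descriptors listed above, for a tiny hand-made image. The function names `cooccurrence_matrix` and `contrast`, the pixel offset and the sample image are assumptions made for this example and are not taken from the cited works.

```python
def cooccurrence_matrix(image, dx=1, dy=0, levels=None):
    """Count how often grey level i at (r, c) is paired with
    grey level j at the offset position (r + dy, c + dx)."""
    if levels is None:
        # Assumption: grey levels are small non-negative integers.
        levels = max(max(row) for row in image) + 1
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def contrast(counts):
    """Contrast statistic: mean squared grey-level difference over all pairs."""
    total = sum(sum(row) for row in counts)
    weighted = sum(
        (i - j) ** 2 * n
        for i, row in enumerate(counts)
        for j, n in enumerate(row)
    )
    return weighted / total


# Hypothetical 4x4 image with four grey levels (0..3).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

glcm = cooccurrence_matrix(image)  # horizontal neighbour, one pixel to the right
print(glcm)
print(contrast(glcm))
```

Changing `dx` and `dy` probes texture along other directions. Statistics such as the contrast value can then serve as low-level features for matching, although which statistics a given system uses is outside what this section states.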
Visual appearance-based representation and recognition
Spatio-temporal motion-based recognition has a wide spectrum of applications in surveillance, automation, health and medical systems, etc., through perceptual identification of biometrics and activity recognition. For example, motion analysis can be used in sports and athletic training: the discrimination between different tennis strokes is investigated by Yamato et al. [88]. Motion-based recognition can be employed in obstacle avoidance of moving
Spatio-temporal video indexing systems
LucentVision [12] is an integrated system for spatio-temporal video retrieval, developed at the Visual Communications Research Department within Bell Labs. It was used effectively for tennis video indexing through spatio-temporal activity maps. The system analyses video from multiple cameras in real time and captures the activity of the players and the ball in the form of motion trajectories. The system stores these trajectories in a database along with video, 3D models of the
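To make the idea of a spatio-temporal activity map concrete, here is a minimal sketch that bins a motion trajectory into a grid of visit counts. It is not LucentVision's implementation, which this section does not describe; the function name `activity_map`, the `(t, x, y)` sample format and the grid dimensions are all hypothetical.

```python
def activity_map(trajectory, width, height, cell):
    """Bin (t, x, y) trajectory samples into a grid of visit counts.

    Assumption: x lies in [0, width] and y in [0, height], in the same
    units as `cell`; samples outside that area are ignored.
    """
    cols = int(width // cell)
    rows = int(height // cell)
    grid = [[0] * cols for _ in range(rows)]
    for _, x, y in trajectory:
        if not (0 <= x <= width and 0 <= y <= height):
            continue
        # Clamp so that points exactly on the far edge land in the last cell.
        c = min(int(x // cell), cols - 1)
        r = min(int(y // cell), rows - 1)
        grid[r][c] += 1
    return grid


# Hypothetical trajectory: (time, x, y) samples on a 4 x 2 area.
trajectory = [
    (0, 0.5, 0.5),
    (1, 0.6, 0.4),
    (2, 1.5, 0.5),
    (3, 3.9, 1.9),
    (4, 4.0, 2.0),
    (5, 9.0, 9.0),  # outside the area, ignored
]

grid = activity_map(trajectory, width=4, height=2, cell=1)
print(grid)
```

With samples taken at a fixed rate, each count is proportional to the time spent in that cell, so the grid gives a compact summary that can be stored and compared alongside the raw trajectories.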
Conclusions
Video retrieval is essentially the task of finding the videos most similar to a query video. Traditionally, text-based labels attached to videos were used for matching. Since the 1980s, significant research into image analysis has opened up the possibility of extracting image content information from these videos, which could form the basis of matching, ranking and retrieving them. In recent years, it has been recognised that raw pixel information and basic statistical features of colour
References (171)
- et al. Image classification and querying using composite region templates. J. Comput. Vision Image Understand. (1999)
- et al. Retrieval of similar pictures on pictorial databases. Pattern Recognition (1991)
- et al. Spatial reasoning and similarity retrieval of images using 2D C-string knowledge representation. Pattern Recognition (1992)
- et al. 2D C-String: a new spatial knowledge representation for image database systems. Pattern Recognition (1990)
- et al. Using 2D C+-string as spatial knowledge representation for image database systems. Pattern Recognition (1994)
- et al. Spatial reasoning and similarity retrieval for image database systems based on RS-strings. Pattern Recognition (1996)
- et al. Signature file as a spatial filter for iconic image database. J. Visual Lang. Comput. (1992)
- et al. 3D C-string: a new spatio-temporal knowledge representation for video database systems. Pattern Recognition (2002)
- et al. Video data indexing by 2D C-Trees. J. Visual Lang. Comput. (1998)
- Temporal reasoning based on semi-intervals. Artif. Intell. (1992)
- Deformable spatio-temporal shape models: extending ASM to 2D+ time. J. Image Vision Comput.
- Querying video libraries. J. Visual Commun. Image Representation
- Description schemes for video programs, users and devices. Signal Process. Image Commun.
- Human motion analysis: a review. Comput. Vision Image Understand.
- A shape recognition scheme based on relative distances of feature points from the centroid. Pattern Recognition
- Recent trends in video analysis: a taxonomy of video classification problems
- Multi-training support vector machine for image retrieval. IEEE Trans. Image Process.
- Asymmetric bagging and random subspace for support vector machine-based relevance feedback in image retrieval. IEEE Trans. Pattern Anal. Mach. Intell.
- Direct kernel biased discriminant analysis: a new content-based image retrieval relevance feedback algorithm. IEEE Trans. Multimedia
- Negative samples analysis in relevance feedback. IEEE Trans. Knowl. Data Eng.
- Which components are important for interactive image searching? IEEE Trans. Circuits Systems Video Technol.
- Symbolic description and visual querying of image sequences using spatio-temporal logic. IEEE Trans. Knowl. Data Eng.
- CORE: a content-based retrieval engine for multimedia information systems. ACM Multimedia Systems
- Toward a conceptual model for the analysis of spatio-temporal processes
- Toward semantics for modelling spatio-temporal processes within GIS
- Instantly indexed multimedia databases of real world events. IEEE Trans. Multimedia
- Monocular depth perception from optical flow by space–time signal processing. Proc. R. Soc. London Ser. B Biol. Sci.
- Spatiotemporal energy models for the perception of motion. J. Opt. Soc. Am.
- Epipolar plane image analysis: an approach to determining structure from motion. Int. J. Comput. Vision
- Generalizing epipolar plane image analysis on the spatiotemporal surface. Int. J. Comput. Vision
- Real-time tracking of moving persons by exploiting spatiotemporal image slices. IEEE Trans. Pattern Anal. Mach. Intell.
- Qualitative spatio-temporal analysis using an oriented energy representation. Proc. Eur. Conf. Comput. Vision
- SEMCOG: a hybrid object-based image and video database system and its modeling, language, and query processing. Theory Prac. Object System (TAPOS)
- Maintaining knowledge about temporal intervals. Commun. ACM
- Texture features for image classification. IEEE Trans. Systems Man Cybernet.
- Iconic indexing by 2D strings. IEEE Trans. Pattern Anal. Mach. Intell.
- Point-set topological spatial relations. Int. J. Geogr. Inf. System
- Design and evaluation of algorithms for image retrieval by spatial similarity. ACM Trans. Inf. Systems
- An intelligent image database system. IEEE Trans. Software Eng.
- Representation and retrieval of symbolic pictures using generalized 2D strings. Proc. Visual Commun. Image Process. IV SPIE
- A generalized approach to image indexing and retrieval based on 2-D strings
About the Author—SAMEER SINGH is Professor of Autonomous Systems in the Department of Computer Science, and is the Director of Research School of Informatics, Loughborough University, UK. He also heads the Computer Vision and Autonomous Systems research group at Loughborough, which has more than 55 members. His main research focus is on the development of novel sensor data analysis and machine learning techniques that can support semi- and fully automated intelligent systems for transportation, security and surveillance, mobile phone networks, and biomedical applications. These diverse applications are complex in nature, depend heavily on advances in machine learning and sensor technology for solving problems, and can benefit enormously from automation. Over the last two decades, Prof. Singh has worked at the interface between computer science, engineering, health sciences and mathematics to develop novel algorithms in the areas of computer vision (quantitative evaluation of image enhancement, evolutionary approaches to object tracking, novelty detection in video sequences, optimisation of image analysis tools, classifying human dynamics, audio–video fusion, and handwriting recognition), and machine learning (multi-resolution pattern recognition, Pareto-evolutionary neural networks, sensor fusion, predictive systems, and multi-objective optimisation). Most of this research has been published in various IEEE Transactions and other leading journals. Altogether, Prof. Singh has published over 170 papers in his career, and currently has more than £2 million in research grant income to support his research team. His work is supported by several leading companies, for example HP Labs, Motorola, Corus Rail and QinetiQ, and by government agencies working on transport and national security. He is also highly active in serving on various conference committees and journals. Notably, he is currently serving as Editor-in-Chief of the journal Pattern Analysis and Applications, published by Springer, and is Associate Editor of IEEE Transactions on SMC B, IEEE Transactions on Knowledge and Data Engineering, Real Time Image Analysis, and Neural Computing and Applications journal.
About the Author—MANEESHA SINGH was born in India. She received the B.S. degree in computer science from Kurukshetra University, Kurukshetra, India and the M.Phil. and Ph.D. degrees from the University of Exeter, Exeter, UK in 1999, 2001 and 2004, respectively. Her Ph.D. was in the area of machine learning for image analysis in aviation security. Her main research interests include image processing, natural scene analysis, video analysis, and neural networks. She has published more than 30 papers in the area of machine learning for image analysis in peer reviewed journals and conferences. Currently she is a Senior Research Fellow at Loughborough University leading the project on imaging for road transport applications.
About the Author—WEI REN graduated from the University of Exeter with a Ph.D. in 2005 in the area of spatiotemporal analysis for video retrieval. Her key research interests are in the areas of image analysis, neural networks and machine learning. During the course of her Ph.D. she developed a novel framework for video retrieval and a publicly available benchmark, Minerva, for video retrieval. She is currently working as a Post-Doctoral researcher at Peking University.