Abstract
We propose a new tool to classify a video clip into one of n given classes (e.g., “news”, “commercials”, etc.). The first novelty of our approach is a method to automatically derive a “vocabulary” from each class of video clips, using the powerful method of “Independent Component Analysis” (ICA). Second, the method is unified in that it works with both video and audio information, and yields a vocabulary that describes not only the still images but also the motion and the audio part. Furthermore, this vocabulary is natural in that it is closely related to human perceptual processing. More specifically, every class of video clips gives a list of “basis functions” that can compress its members very well. Once video clips are represented in these “vocabularies”, we can do classification and pattern discovery. To classify a video clip, we propose using compression: we test which of the “vocabularies” compresses the clip best and assign the clip to the corresponding class. For data mining, we inspect the basis functions of each video genre class and identify genre characteristics such as fast motions/transitions, more harmonic audio, etc. In experiments on real data of 62 news and 43 commercial clips, our method achieved an overall accuracy of ≈81%.
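The classify-by-compression idea can be illustrated with a small sketch. This is not the authors' implementation: it swaps in a PCA basis (computed via SVD) for the ICA basis named in the abstract and uses mean squared reconstruction error as a stand-in for compression quality. The function names (`learn_basis`, `compression_error`, `classify`), the synthetic 4-dimensional feature vectors, and the choice of `k` are all assumptions made for illustration.

```python
import numpy as np

def learn_basis(samples, k):
    """Learn a per-class "vocabulary": the mean and the top-k principal directions.

    Assumption: PCA stands in here for the ICA basis described in the abstract.
    """
    mean = samples.mean(axis=0)
    # Rows of vt are orthonormal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
    return mean, vt[:k]

def compression_error(samples, model):
    """Mean squared error after projecting samples onto the class basis."""
    mean, basis = model
    centered = samples - mean
    coeffs = centered @ basis.T   # encode with k coefficients per sample
    recon = coeffs @ basis        # decode back to feature space
    return float(np.mean((centered - recon) ** 2))

def classify(samples, models):
    """Assign the clip to the class whose basis reconstructs it best."""
    errors = {label: compression_error(samples, m) for label, m in models.items()}
    return min(errors, key=errors.get)

rng = np.random.default_rng(0)

def make_clip(direction, n=200, noise=0.05):
    """Synthetic clip: feature vectors along one dominant direction plus noise."""
    scale = rng.normal(size=(n, 1))
    return scale * direction + noise * rng.normal(size=(n, direction.size))

# Hypothetical dominant feature directions for the two classes.
news_dir = np.array([1.0, 0.0, 0.0, 0.0])
ads_dir = np.array([0.0, 0.0, 1.0, 0.0])

models = {
    "news": learn_basis(make_clip(news_dir), k=1),
    "commercials": learn_basis(make_clip(ads_dir), k=1),
}

news_label = classify(make_clip(news_dir), models)
ads_label = classify(make_clip(ads_dir), models)
print(news_label, ads_label)
```

With a real ICA basis, compression quality would more plausibly be judged by how sparsely the coefficients encode a clip than by truncated reconstruction error; the sketch only shows the decision rule of picking the class whose vocabulary fits best.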
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pan, JY., Faloutsos, C. (2002). VideoCube: A Novel Tool for Video Mining and Classification. In: Lim, E.P., et al. Digital Libraries: People, Knowledge, and Technology. ICADL 2002. Lecture Notes in Computer Science, vol 2555. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36227-4_20
DOI: https://doi.org/10.1007/3-540-36227-4_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00261-1
Online ISBN: 978-3-540-36227-2
eBook Packages: Springer Book Archive