VideoCube: A Novel Tool for Video Mining and Classification

Pan, Jia-Yu; Faloutsos, Christos

doi:10.1007/3-540-36227-4_20

Conference paper
First Online: 16 December 2002

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2555)

Abstract

We propose a new tool to classify a video clip into one of n given classes (e.g., “news”, “commercials”, etc.). The first novelty of our approach is a method to automatically derive a “vocabulary” from each class of video clips, using the powerful method of “Independent Component Analysis” (ICA). Second, the method is unified in that it works with both video and audio information, and yields a vocabulary describing not only the still images, but also the motion and the audio part. Furthermore, this vocabulary is natural in that it is closely related to human perceptual processing. More specifically, every class of video clips gives a list of “basis functions”, which can compress its members very well. Once we represent video clips in “vocabularies”, we can do classification and pattern discovery. For the classification of a video clip, we propose using compression: we test which of the “vocabularies” compresses the video clip best, and we assign the clip to the corresponding class. For data mining, we inspect the basis functions of each video genre class and identify genre characteristics such as fast motions/transitions, more harmonic audio, etc. In experiments on real data of 62 news and 43 commercial clips, our method achieved an overall accuracy of ≈81%.
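The classify-by-compression idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. To stay dependency-free it learns each class "vocabulary" with a PCA-style power iteration as a stand-in for ICA, and it uses squared reconstruction error as a rough proxy for how well a vocabulary compresses a clip. The function names (`learn_vocabulary`, `reconstruction_error`, `classify`) and the 4-dimensional toy feature vectors are assumptions made only for illustration.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def learn_vocabulary(clips, k, iters=100, seed=0):
    """Learn up to k orthonormal basis vectors for one class.

    Assumption: PCA by power iteration stands in for ICA here.
    Returns (mean, basis).
    """
    rng = random.Random(seed)
    mean = [sum(col) / len(clips) for col in zip(*clips)]
    centered = [[x - m for x, m in zip(clip, mean)] for clip in clips]
    dim = len(mean)
    basis = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        found = False
        for _ in range(iters):
            # One power-iteration step: w = X^T (X v), without forming X^T X.
            coeffs = [dot(row, v) for row in centered]
            w = [sum(c * row[j] for c, row in zip(coeffs, centered))
                 for j in range(dim)]
            # Deflation: remove components along directions already found.
            for b in basis:
                c = dot(w, b)
                w = [wi - c * bi for wi, bi in zip(w, b)]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:
                break  # no variance left outside the current basis
            v = [wi / norm for wi in w]
            found = True
        if not found:
            break
        basis.append(v)
    return mean, basis


def reconstruction_error(clip, vocabulary):
    """Squared error left after projecting the clip onto the vocabulary."""
    mean, basis = vocabulary
    residual = [x - m for x, m in zip(clip, mean)]
    for b in basis:
        c = dot(residual, b)
        residual = [r - c * bi for r, bi in zip(residual, b)]
    return dot(residual, residual)


def classify(clip, vocabularies):
    """Assign the clip to the class whose vocabulary reconstructs it best."""
    return min(vocabularies,
               key=lambda name: reconstruction_error(clip, vocabularies[name]))


# Hypothetical toy feature vectors, not data from the paper.
news = [[t, t, 0.0, 0.0] for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]
commercials = [[0.0, 0.0, t, -t] for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]
vocabularies = {
    "news": learn_vocabulary(news, k=1),
    "commercials": learn_vocabulary(commercials, k=1),
}
print(classify([3.0, 3.2, 0.1, -0.2], vocabularies))
```

In this sketch a clip is a single feature vector. The abstract describes vocabularies covering still images, motion, and audio, so a real pipeline would have to extract such features first, and those details are not specified here.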

Author information

Authors and Affiliations

Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
Jia-Yu Pan & Christos Faloutsos

Editor information

Editors and Affiliations

Nanyang Technological University, Singapore
Ee-Peng Lim, Schubert Foo & Chris Khoo
University of Arizona, USA
Hsinchun Chen
Virginia Tech, USA
Edward Fox
University of Mysore, Mysore
Shalini Urs
IEI-CNR, Italy
Thanos Costantino

About this paper

Cite this paper

Pan, JY., Faloutsos, C. (2002). VideoCube: A Novel Tool for Video Mining and Classification. In: Lim, E.P., et al. Digital Libraries: People, Knowledge, and Technology. ICADL 2002. Lecture Notes in Computer Science, vol 2555. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36227-4_20

DOI: https://doi.org/10.1007/3-540-36227-4_20
Published: 16 December 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00261-1
Online ISBN: 978-3-540-36227-2
eBook Packages: Springer Book Archive
