DOI: 10.1145/1386352.1386358
research-article

Identifying relevant frames in weakly labeled videos for training concept detectors

Published: 07 July 2008

ABSTRACT

A key problem in the automatic detection of semantic concepts (like 'interview' or 'soccer') in video streams is the manual acquisition of adequate training sets. Recently, we proposed using online videos downloaded from portals like youtube.com for this purpose, with the tags that users provide during video upload serving as ground-truth annotations.

The problem with such training data is that it is weakly labeled: annotations are provided only at the video level, and many shots of a video may be "non-relevant", i.e., not visually related to a tag. In this paper, we present a probabilistic framework for learning from such weakly annotated training videos in the presence of irrelevant content. In this framework, the relevance of each keyframe is modeled as a latent random variable that is estimated during training.

In quantitative experiments on real-world online videos and TV news data, we demonstrate that the proposed model significantly increases robustness to irrelevant content and improves the generalization of the resulting concept detectors.
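
The abstract does not spell out the model, so the following is only a rough sketch of what treating keyframe relevance as a latent variable could look like. It assumes, purely for illustration, a single scalar feature per keyframe, Gaussian densities, a background ("irrelevant") density that is already known, and expectation-maximization as the fitting procedure. The function names and parameters are hypothetical and are not taken from the paper.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def estimate_relevance(features, bg_mean, bg_std, n_iter=50):
    """Toy EM with a latent relevance variable per keyframe.

    Assumption (not from the paper): each keyframe feature comes either
    from a concept-specific Gaussian (relevant) or from a fixed, known
    background Gaussian (irrelevant). Only the concept Gaussian and the
    relevance prior are learned.
    """
    eps = 1e-300                      # guards against division by zero
    mean, std, prior = max(features), bg_std, 0.5
    resp = [0.5] * len(features)
    for _ in range(n_iter):
        # E-step: posterior probability that each keyframe is relevant
        resp = []
        for x in features:
            p_rel = prior * gaussian_pdf(x, mean, std)
            p_irr = (1.0 - prior) * gaussian_pdf(x, bg_mean, bg_std)
            resp.append(p_rel / (p_rel + p_irr + eps))
        # M-step: re-estimate the concept density from soft assignments
        total = sum(resp) + eps
        mean = sum(r * x for r, x in zip(resp, features)) / total
        var = sum(r * (x - mean) ** 2 for r, x in zip(resp, features)) / total
        std = max(math.sqrt(var), 1e-3)
        prior = total / len(features)
    return resp, mean, std, prior


if __name__ == "__main__":
    random.seed(0)
    # Synthetic "video": 60 relevant keyframes, 140 irrelevant ones
    frames = [random.gauss(5.0, 1.0) for _ in range(60)]
    frames += [random.gauss(0.0, 1.0) for _ in range(140)]
    resp, mean, std, prior = estimate_relevance(frames, 0.0, 1.0)
    print(f"concept mean={mean:.2f}, std={std:.2f}, relevance prior={prior:.2f}")
```

In this toy setup, the per-keyframe posteriors in `resp` play the role of soft relevance labels: frames that look like background contribute little to the concept model, which is the intuition behind the robustness to irrelevant content described above.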


Published in

CIVR '08: Proceedings of the 2008 international conference on Content-based image and video retrieval
July 2008
674 pages
ISBN: 9781605580708
DOI: 10.1145/1386352

      Copyright © 2008 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States
