
Fine-Grain Level Sports Video Search Engine

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11961)

Abstract

Helping people find relevant content of interest in massive collections of sports video has become an urgent demand. We have designed and developed a sports video search engine on a distributed architecture that provides users with content-based video analysis and retrieval services. The engine focuses on event detection, highlight analysis, and image retrieval. Our work has several advantages: (I) For event detection, a CNN and an RNN extract features and integrate dynamic information, and a new sliding-window model detects events of multiple lengths. (II) For highlight analysis, an improved method based on a self-adapting dual threshold and the dominant-color percentage detects shot boundaries, and an affect-arousal method extracts highlights. (III) For image indexing and retrieval, a hyper-spherical soft-assignment method is proposed to generate image descriptors, an enhanced residual vector quantization is presented to construct a multi-inverted index, and two adaptive retrieval methods based on hyper-spherical filtering improve time efficiency. (IV) All of the above algorithms are implemented on the distributed platform we developed for massive video data processing.
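
The page carries only the abstract, so as a rough illustration of point (I), below is a minimal sketch in Python of how a multi-length sliding-window detector could aggregate frame-level event scores. It assumes the CNN/RNN backbone has already produced a per-frame event probability track; the function name detect_events, the window lengths, stride, thresholds, and the greedy suppression step are hypothetical choices for illustration, not the authors' actual method or settings.

    import numpy as np

    def detect_events(frame_scores, window_lengths=(32, 64, 128),
                      stride=8, threshold=0.6, iou_thresh=0.5):
        """Score fixed-length windows of several sizes over a per-frame
        probability track and keep confident, non-overlapping detections.
        Returns a list of (start_frame, end_frame, score) tuples."""
        n = len(frame_scores)
        candidates = []
        for w in window_lengths:                         # try several event lengths
            for start in range(0, max(n - w, 0) + 1, stride):
                score = float(np.mean(frame_scores[start:start + w]))
                if score >= threshold:                   # keep confident windows only
                    candidates.append((start, start + w, score))

        # Greedy non-maximum suppression so overlapping windows of different
        # lengths do not report the same event twice.
        candidates.sort(key=lambda c: c[2], reverse=True)
        kept = []
        for s, e, sc in candidates:
            suppressed = False
            for ks, ke, _ in kept:
                inter = max(0, min(e, ke) - max(s, ks))
                union = (e - s) + (ke - ks) - inter
                if inter / union > iou_thresh:
                    suppressed = True
                    break
            if not suppressed:
                kept.append((s, e, sc))
        return kept

    if __name__ == "__main__":
        # Toy input: low background scores with one high-scoring stretch
        # around frames 100-180 standing in for an event.
        rng = np.random.default_rng(0)
        scores = rng.uniform(0.0, 0.3, size=600)
        scores[100:180] = rng.uniform(0.7, 0.95, size=80)
        print(detect_events(scores))

Scoring a window by the mean frame probability is only one plausible aggregation; the actual system would score windows with its learned CNN/RNN model rather than this stand-in.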



Acknowledgments

We gratefully acknowledge the financial support of the National Natural Science Foundation of China (Nos. 61572211, 61173114, 61202300).

Author information


Corresponding author

Correspondence to Junqing Yu.



Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Song, Z., Yu, J., Cai, H., Hu, Y., Chen, Y.P.P. (2020). Fine-Grain Level Sports Video Search Engine. In: Ro, Y., et al. (eds.) MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol. 11961. Springer, Cham. https://doi.org/10.1007/978-3-030-37731-1_42


  • DOI: https://doi.org/10.1007/978-3-030-37731-1_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-37730-4

  • Online ISBN: 978-3-030-37731-1

  • eBook Packages: Computer Science, Computer Science (R0)
