Fine-Grain Level Sports Video Search Engine

Song, Zikai; Yu, Junqing; Cai, Hengyou; Hu, Yangliu; Chen, Yi-Ping Phoebe

doi:10.1007/978-3-030-37731-1_42


Conference paper
First Online: 24 December 2019


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11961)

Abstract

Helping people find relevant content of interest in massive collections of sports video has become an urgent need. We have designed and developed a sports video search engine on a distributed architecture that provides users with content-based video analysis and retrieval services. The engine focuses on event detection, highlights analysis, and image retrieval. Our work has several advantages: (I) For event detection, a CNN and an RNN are used to extract features and integrate dynamic information, and a new sliding window model is used for multi-length event detection. (II) For highlights analysis, an improved method based on a self-adapting dual threshold and dominant color percentage is used to detect shot boundaries, and an affect arousal method is used for highlights extraction. (III) For image indexing and retrieval, a hyper-spherical soft assignment method is proposed to generate image descriptors, enhanced residual vector quantization is presented to construct a multi-inverted index, and two adaptive retrieval methods based on hyper-spherical filtration are used to improve time efficiency. (IV) All of the above algorithms are implemented on a distributed platform that we developed for massive video data processing.
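The abstract does not spell out the dual-threshold shot boundary step named in (II). As a rough illustration only, the Python sketch below shows one common form of the idea (the classic twin-comparison scheme), with both thresholds derived from the mean and standard deviation of the frame-to-frame differences. The function names, the multipliers `k_low` and `k_high`, and the mean-plus-sigma rule are all assumptions made for this example and are not taken from the paper. The dominant color percentage component is not modelled here.

```python
from statistics import mean, pstdev


def adaptive_thresholds(diffs, k_low=0.5, k_high=3.0):
    """Derive (low, high) thresholds from the difference statistics.

    Assumption for illustration: threshold = mean + k * standard deviation.
    """
    mu = mean(diffs)
    sigma = pstdev(diffs)
    return mu + k_low * sigma, mu + k_high * sigma


def detect_shot_boundaries(diffs, k_low=0.5, k_high=3.0):
    """Twin-comparison style detection over frame differences.

    diffs[i] is a dissimilarity score between frame i and frame i + 1
    (for example, a color histogram distance computed elsewhere).
    Returns (cuts, graduals): cuts is a list of indices with an abrupt
    change; graduals is a list of (start, end) index ranges in diffs.
    """
    if not diffs:
        return [], []
    low, high = adaptive_thresholds(diffs, k_low, k_high)
    cuts, graduals = [], []
    start, accumulated = None, 0.0
    for i, d in enumerate(diffs):
        if d >= high:
            # A single large jump is treated as a hard cut; any open
            # gradual candidate is discarded.
            cuts.append(i)
            start, accumulated = None, 0.0
        elif d >= low:
            # Moderate change: open or extend a gradual candidate.
            if start is None:
                start, accumulated = i, 0.0
            accumulated += d
        else:
            # Change fell below the low threshold: close the candidate and
            # keep it only if the accumulated change is as large as a cut.
            if start is not None and accumulated >= high:
                graduals.append((start, i - 1))
            start, accumulated = None, 0.0
    if start is not None and accumulated >= high:
        graduals.append((start, len(diffs) - 1))
    return cuts, graduals
```

In this sketch a single frame-to-frame spike above the high threshold is reported as a cut, while a run of moderate differences is reported as a gradual transition only when its accumulated change reaches the high threshold, so that a brief one-frame blip is ignored.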

Acknowledgments

We gratefully acknowledge the financial support granted by the National Natural Science Foundation of China (Nos. 61572211, 61173114, 61202300).

Author information

Authors and Affiliations

School of Computer Science and Technology, Center of Network and Computation, Huazhong University of Science and Technology, Wuhan, China
Zikai Song, Junqing Yu, Hengyou Cai & Yangliu Hu
La Trobe University, Melbourne, VIC, 3086, Australia
Yi-Ping Phoebe Chen

Corresponding author

Correspondence to Junqing Yu.

Editor information

Editors and Affiliations

Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Yong Man Ro
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Junmo Kim
National Cheng Kung University, Tainan City, Taiwan
Wei-Ta Chu
Tsinghua University, Beijing, China
Peng Cui
Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Jung-Woo Choi
National Tsing Hua University, Hsinchu, Taiwan
Min-Chun Hu
Ghent University, Ghent, Belgium
Wesley De Neve

About this paper

Cite this paper

Song, Z., Yu, J., Cai, H., Hu, Y., Chen, YP.P. (2020). Fine-Grain Level Sports Video Search Engine. In: Ro, Y., et al. MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol 11961. Springer, Cham. https://doi.org/10.1007/978-3-030-37731-1_42

DOI: https://doi.org/10.1007/978-3-030-37731-1_42
Published: 24 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37730-4
Online ISBN: 978-3-030-37731-1
eBook Packages: Computer Science, Computer Science (R0)
